§ 1. The Riemann Integral


The theory of integration expounded in this Chapter dates from the XIXth
century; it was, and remains, of great use in classical mathematics, and its 
simplicity has rewarded all who have written for beginners in the subject. 
For professional mathematicians it has been dethroned by the much more 
powerful, and in some respects simpler, theory invented by Henri Lebesgue 
around 1900, and perfected in the course of the first half of the XXth century
by dozens of others; we present a small part of it in the Appendix to this 
Chapter. The “Riemann” theory expounded in this Chapter therefore has 
only a pedagogic interest. 


1 — Upper and lower integrals of a bounded function 


Let us first recall the definitions of Chap. II, n° 11.

A scalar (i.e. complex-valued) function $\varphi$ defined on a compact, or more
generally, bounded, interval $I$ is said to be a step function if one can find a
partition (Chap. I) of $I$ into a finite number of intervals $I_k$ such that $\varphi$ is
constant on each $I_k$; no conditions are imposed on the $I_k$. Such a partition
will be said to be adapted to $\varphi$.

When $I = (a, b)$ this is the same as requiring the existence of a finite
sequence of points of $I$ satisfying

(1.1)  $a = x_1 < x_2 < \ldots < x_{n+1} = b$

and such that $\varphi$ is constant on each open interval $]x_k, x_{k+1}[$, because the
values it takes at a point $x_k$ have no connection with those it takes to the
right or left of this point, and are irrelevant to the calculation of traditional
integrals¹.

A sequence of points satisfying (1) is called a subdivision of the interval $I$.
A subdivision by the points $y_h$ is said to be finer than the subdivision (1)
when the $x_k$ appear among the $y_h$, in other words when the second subdivision
is obtained by subdividing each of the component intervals in (1). The
definition is similar for two partitions $(I_k)$ and $(J_h)$ of $I$: the second is said
to be finer than the first if every $J_h$ is contained in one of the $I_k$, in other
words if the second partition of $I$ is obtained by partitioning each of the $I_k$
themselves into intervals (namely, those $J_h$ contained in $I_k$).

If $\varphi(x) = a_k$ for every $x \in I_k$ one calls the number

(1.2)  $m(\varphi) = \sum a_k m(I_k) = \sum \varphi(\xi_k) m(I_k)$

the integral of $\varphi$ over $I$, where, for every interval $J = (u, v)$, the number
$m(J) = v - u$ denotes the length or measure of $J$, and where $\xi_k$ is any point
of $I_k$. Since the $I_k$ of zero measure do not matter in (2) one can replace the
partition by a subdivision (1) and write

(1.3)  $m(\varphi) = \sum \varphi(\xi_k)(x_{k+1} - x_k)$  with $x_k < \xi_k < x_{k+1}$

since $\varphi$ is constant, so equal to $\varphi(\xi_k)$, on $]x_k, x_{k+1}[$.

¹ This is not the same in generalisations of the classical theory. See n° 30.

Since there are infinitely many ways of choosing the $I_k$ (every finer
partition, for example, will equally be adapted to calculating the integral),
we have to show that the sum (2) does not depend on the choice of the $I_k$. So
let $(J_h)$ be another partition of $I$ into intervals such that $\varphi(x) = b_h$ for every
$x \in J_h$. Since each $I_k$ is the union of the pairwise disjoint intervals $I_k \cap J_h$,
as is shown by the relation

$X = X \cap I = X \cap \bigcup_h J_h = \bigcup_h (X \cap J_h)$,

valid for every subset $X$ of $I$, we have

$m(I_k) = \sum_h m(I_k \cap J_h)$

and similarly

$m(J_h) = \sum_k m(I_k \cap J_h)$

where, by convention, $m(\emptyset) = 0$. Thus

(1.4)  $\sum_k a_k m(I_k) = \sum_{k,h} a_k m(I_k \cap J_h)$,
(1.5)  $\sum_h b_h m(J_h) = \sum_{k,h} b_h m(I_k \cap J_h)$,

where, on the right hand sides, we sum over all the pairs $(k, h)$. We thus have
only to prove that

$m(I_k \cap J_h) \neq 0$ implies $a_k = b_h$,

which is clear: on $I_k \cap J_h$, which is nonempty since its length is not zero, the
function $\varphi$ is equal simultaneously to $a_k$ and to $b_h$.
This argument shows immediately that

(1.6)  $m(\lambda\varphi + \mu\psi) = \lambda m(\varphi) + \mu m(\psi)$

for any step functions $\varphi$ and $\psi$ and constants $\lambda$ and $\mu$: consider partitions
$(I_k)$ and $(J_h)$ of $I$ adapted to $\varphi$ and $\psi$, write $a_k$ for the value of $\varphi$ on $I_k$ and
$b_h$ for that of $\psi$ on $J_h$, and calculate the integrals of $\varphi$, $\psi$ and $\lambda\varphi + \mu\psi$ using
the intervals $I_k \cap J_h$, on which $\varphi$, $\psi$ and $\lambda\varphi + \mu\psi$ are equal respectively to $a_k$,
$b_h$ and $\lambda a_k + \mu b_h$; in effect we are adding the relations (4) and (5), multiplied
respectively by $\lambda$ and $\mu$, term-by-term.
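The two facts just proved (independence of the adapted subdivision, and linearity via a common refinement) can be checked on a small example. The following is only a sketch in exact rational arithmetic; the helper names `integral` and `refine` and the particular step functions are illustrative assumptions, not part of the text.

```python
from fractions import Fraction as F

def integral(breaks, values):
    """m(phi) as in (1.3): phi equals values[k] on ]breaks[k], breaks[k+1][."""
    return sum(v * (b - a) for v, a, b in zip(values, breaks, breaks[1:]))

def refine(breaks, values, finer):
    """Values of the same step function on a finer subdivision."""
    assert set(breaks) <= set(finer), "finer must contain every original point"
    out = []
    for left in finer[:-1]:
        # index of the original interval [breaks[k], breaks[k+1][ containing `left`
        k = max(i for i, x in enumerate(breaks[:-1]) if x <= left)
        out.append(values[k])
    return out

# Hypothetical step functions on I = [0, 1].
phi_breaks, phi_vals = [F(0), F(1, 2), F(1)], [F(2), F(-1)]
psi_breaks, psi_vals = [F(0), F(1, 3), F(1)], [F(3), F(5)]

# Common refinement: the intervals I_k ∩ J_h of the proof.
common = sorted(set(phi_breaks) | set(psi_breaks))
phi_c = refine(phi_breaks, phi_vals, common)
psi_c = refine(psi_breaks, psi_vals, common)

lam, mu = F(4), F(-7)
combo = [lam * p + mu * q for p, q in zip(phi_c, psi_c)]

m_phi = integral(phi_breaks, phi_vals)
m_psi = integral(psi_breaks, psi_vals)
m_combo = integral(common, combo)
```

Here `integral(common, phi_c)` agrees with `m_phi`, and `m_combo` agrees with `lam * m_phi + mu * m_psi`, as (1.6) predicts.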

Since it is clear that the integral of a positive function (i.e. one whose
values are all positive) is positive, we see that

(1.7)  $\varphi \le \psi$ implies $m(\varphi) \le m(\psi)$

for real-valued $\varphi$ and $\psi$, since $m(\psi) - m(\varphi) = m(\psi - \varphi) \ge 0$ by (6) and $\psi - \varphi$
is positive.

Finally, the triangle inequality applied to (2) shows that

$|m(\varphi)| \le \sum |\varphi(\xi_k)| m(I_k) = m(|\varphi|) \le \sum \|\varphi\|_I m(I_k)$

always, where, as in Chap. III, n° 7, we write in a general way that

$\|f\|_I = \sup_{x \in I} |f(x)|$.

Since $\sum m(I_k) = m(I)$ we finally obtain the inequality

(1.8)  $|m(\varphi)| \le m(|\varphi|) \le m(I)\|\varphi\|_I$.


This completes the “theory” of integration as it applies to step functions. 
It rests on two properties of lengths which are the starting point for all later 
generalisations: 


(M 1): the measure of an interval is positive;
(M 2): measure is additive, i.e. if an interval $J$ is the union of a finite number
of pairwise disjoint intervals $J_k$, then $m(J) = \sum m(J_k)$.


There are many other interval-functions which have these properties. One
can, for example, choose a continuous function $\mu(x)$ which is increasing in
the wide sense on $I$ and put²

$\mu(J) = \mu(v) - \mu(u)$  if $J = (u, v)$.

² For an arbitrary increasing function one has to take account of its discontinuities
and modify the formula to obtain a reasonable theory. See n° 32 on Stieltjes
measures.


One can also take a finite or countable set $D \subset I$ and assign to each $\xi \in D$
a “mass” $c(\xi) > 0$, with $\sum c(\xi) < +\infty$, and then put

$\mu(J) = \sum_{\xi \in D \cap J} c(\xi)$

for every interval $J$, so that the measure of a singleton interval can very well
be $> 0$; in this example property (M 2) reduces to the associativity formula
for absolutely convergent series. We obtain discrete measures in this way.

For a “measure” $\mu$ satisfying (M 1) and (M 2) the integral of a step function
is, by definition, the number $\mu(\varphi)$ given by the formula (2), replacing
the letter $m$ by the letter $\mu$. For a discrete measure, one clearly finds that
$\mu(\varphi) = \sum c(\xi)\varphi(\xi)$, summing over all the $\xi \in D$. These generalisations will
be studied at the end of this chapter, but the reader may be interested to ob- 
serve, every time we use the traditional integral, those results which depend 
only on the properties (M 1) and (M 2) of “Euclidean” or “Archimedean” 
measure, or, as one now calls it, of “Lebesgue measure” (since it was for 
this that Lebesgue constructed his grand integration theory) because these 
properties extend to the general case. Certain results which, on the contrary, 
use the explicit construction starting from the usual measure, mainly concern 
the relations between integrals and derivatives, Fourier series and integrals, 
partial differential equations, almost all applications to physical sciences, etc. 
They rest on an obvious though fundamental property of the usual measure: 
it is invariant under translation; see below, (2.20). 
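To make the discrete case concrete, here is a minimal sketch with a made-up measure consisting of three point masses on $I = [0, 1]$. It computes the integral of a step function once by formula (2) with $\mu$ in place of $m$, and once as the sum of the $c(\xi)\varphi(\xi)$. The masses, the partition and the names `mu`, `pieces`, `phi` are assumptions chosen for illustration only.

```python
from fractions import Fraction as F

# Hypothetical discrete measure on I = [0, 1]: a mass c(xi) > 0 at each xi in D.
masses = {F(0): F(1, 2), F(1, 4): F(2), F(1): F(3)}

def mu(contains):
    """mu(J) = sum of c(xi) over the xi in D lying in J (J given by a membership test)."""
    return sum((c for xi, c in masses.items() if contains(xi)), F(0))

# A partition of [0, 1] into intervals, one of them a singleton,
# together with the constant value of phi on each piece.
pieces = [
    (lambda x: F(0) <= x < F(1, 4), F(5)),    # [0, 1/4[
    (lambda x: x == F(1, 4), F(7)),           # {1/4}
    (lambda x: F(1, 4) < x <= F(1), F(-1)),   # ]1/4, 1]
]

def phi(x):
    """The step function defined by `pieces`."""
    return next(value for contains, value in pieces if contains(x))

integral_by_formula_2 = sum(value * mu(contains) for contains, value in pieces)
integral_by_masses = sum(c * phi(xi) for xi, c in masses.items())
total_mass = mu(lambda x: F(0) <= x <= F(1))
```

The singleton $\{1/4\}$ carries measure 2 here, so the value of $\varphi$ at that single point does contribute to the integral, unlike in the case of the usual measure.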


Now let us pass on to arbitrary bounded real functions on a bounded 
interval I (in general compact). 

Given a bounded real-valued function $f$ on $I$ there exist step functions,
even constant functions, $\varphi$ and $\psi$, such that $\varphi \le f \le \psi$, i.e. $\varphi(x) \le f(x) \le
\psi(x)$ for every $x \in I$. By (7) we must have $m(\varphi) \le m(\psi)$, and every reasonable
definition of $m(f)$ must satisfy $m(\varphi) \le m(f) \le m(\psi)$. We therefore
examine the lower and upper integrals of $f$ over $I$ defined by the formulae

(1.9)  $m_*(f) = \sup_{\varphi \le f} m(\varphi)$,   $m^*(f) = \inf_{\psi \ge f} m(\psi)$,

where $\varphi$ and $\psi$ range over the sets of step functions such that $\varphi \le f \le \psi$.

As we have seen in Chap. II, n° 11, we have $m_*(f) \le m^*(f)$ since every
number $m(\varphi)$ is less than the $m(\psi)$, so is less than their lower bound $m^*(f)$,
which, larger than all the $m(\varphi)$, is also larger than their upper bound $m_*(f)$.
Since the constant functions equal to $-\|f\|_I$ and $+\|f\|_I$ feature among the
functions $\varphi$ and $\psi$ respectively, we even have

(1.10)  $-m(I)\|f\|_I \le m_*(f) \le m^*(f) \le m(I)\|f\|_I$.
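The suprema and infima in (1.9) cannot be computed by enumeration, but any particular pair of step functions framing $f$ gives bounds. As a sketch, assume $f$ is increasing on $[a, b]$ (here $f(x) = x^2$ on $[0, 1]$, an illustrative choice): on a uniform subdivision, the step function equal to $f(x_k)$ on $[x_k, x_{k+1}[$ lies below $f$ and the one equal to $f(x_{k+1})$ on $]x_k, x_{k+1}]$ lies above it. The function name `step_bounds` is hypothetical.

```python
from fractions import Fraction as F

def step_bounds(f, a, b, n):
    """Integrals of two step functions framing an increasing f on [a, b].

    Returns (m(phi), m(psi)) with phi <= f <= psi, so that
    m(phi) <= m_*(f) <= m^*(f) <= m(psi).
    """
    h = (b - a) / n
    xs = [a + k * h for k in range(n + 1)]
    lower = sum(f(x) * h for x in xs[:-1])   # value f(x_k) on [x_k, x_{k+1}[
    upper = sum(f(x) * h for x in xs[1:])    # value f(x_{k+1}) on ]x_k, x_{k+1}]
    return lower, upper

square = lambda x: x * x
lower, upper = step_bounds(square, F(0), F(1), 10)
gap = upper - lower   # equals (f(b) - f(a)) * (b - a) / n for increasing f
```

Since the gap is $1/n$ in this example, the lower and upper integrals of $x^2$ on $[0, 1]$ differ by less than any $1/n$ and therefore coincide.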
Relation (6) does not extend to the lower and upper integrals of arbitrary
functions; if it did, the theory of integration would finish with n° 2 of this
chapter. However, we always have the inequalities

(1.11)  $m_*(f + g) \ge m_*(f) + m_*(g)$,   $m^*(f + g) \le m^*(f) + m^*(g)$.

Among the step functions less than $f + g$ are the sums $\varphi + \psi$, where $\varphi$ is less
than $f$ and where $\psi$ is less than $g$; consequently, $m_*(f + g)$ is greater than
all the numbers of the form $m(\varphi + \psi) = m(\varphi) + m(\psi)$. It remains to note
that if $A$ and $B$ are two sets of real numbers, and if one writes $A + B$ for the
set of numbers $x + y$ where $x \in A$ and $y \in B$, then

$\sup(A + B) = \sup(A) + \sup(B)$

with a similar relation for the lower bounds (exercise!), so that every number
larger than the $x + y$ is larger than $\sup(A) + \sup(B)$. Whence the first relation
(11). The second is proved in the same way, reversing the inequalities.

It is easier to show that 


(1.12)  $m_*(cf) = c\,m_*(f)$,   $m^*(cf) = c\,m^*(f)$   for every $c > 0$

and

(1.13)  $m_*(-f) = -m^*(f)$;


it is enough to note that multiplication by —1 transforms the step functions 
below f into those above —f. 


2 — Elementary properties of integrals 


The most natural definition of integrable functions with real values is that 
they should satisfy the condition 


$m^*(f) = m_*(f)$;

the common value of the two sides then being the value of the integral $m(f)$
of $f$; one extends the definition to functions $f = g + ih$ with complex values
by requiring both $g$ and $h$ to be integrable and putting

$m(f) = m(g) + i\,m(h)$.

This definition, adopted in the First French Edition for reasons of simplicity,
has several drawbacks; in particular, it is not obvious (although, of course,
true) that the absolute value $|f| = [\mathrm{Re}(f)^2 + \mathrm{Im}(f)^2]^{1/2}$ of a complex-valued
integrable function is again integrable, as Michel Ollitrault, a reader of the
First Edition, has justly remarked to me. We shall therefore abandon this
definition temporarily, to recover it later, and we shall adopt a method used
in the modern theory too. We shall develop it for complex-valued functions,
but it will also apply to functions with values in a finite dimensional vector 
space, or even a Banach space, which is not the case for the first simplistic 
definition. 

We shall say that a function $f$ is integrable if, for any $r > 0$, there is
a step function $\varphi$ (with values in the same space as $f$ if one is integrating
vector-valued functions) such that

(2.1)  $m^*(|f - \varphi|) < r$.

If $f$ has real values this means, intuitively, that the numerical (and not
algebraic) measure of the area in the plane included between the graphs of $f$
and $\varphi$ is $< r$; there is no point in assuming $\varphi$ “above” or “below” $f$. It comes
to the same to require the existence of a sequence of step functions $\varphi_n$ such
that

(2.1')  $\lim m^*(|f - \varphi_n|) = 0$

or, as one says, which converges in mean to $f$. One says “in mean” because
the fact that the upper integral of a positive function $h(x)$ is very small
does not prevent $h$ from taking very large values on very small intervals:
$10^{100} \cdot 10^{-200} = 10^{-100}$.
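The arithmetic behind this remark can be spelled out with exact fractions. The sketch below (names and the particular sequence are illustrative assumptions) takes a step function equal to `height` on one interval of length `width` and to 0 elsewhere; its integral is the product, however large the height. The second computation uses spikes of height $n$ on intervals of length $1/n^2$, whose integrals $1/n$ tend to 0 while their uniform norms tend to infinity.

```python
from fractions import Fraction as F

def spike_integral(height, width):
    """Integral of the step function equal to `height` on an interval of
    length `width` and to 0 elsewhere."""
    return height * width

# The example of the text: a huge value on a tiny interval.
height, width = F(10) ** 100, F(10) ** -200
tiny = spike_integral(height, width)

# A sequence converging to 0 in mean but not uniformly.
ns = (1, 10, 100, 1000)
integrals = [spike_integral(F(n), F(1, n * n)) for n in ns]
sup_norms = [F(n) for n in ns]
```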

To define the integral of an integrable function $f$ one uses the relation
(1’). By the triangle inequality we have

$|\varphi_p - \varphi_q| \le |\varphi_p - f| + |f - \varphi_q|$

and so

$|m(\varphi_p) - m(\varphi_q)| = |m(\varphi_p - \varphi_q)| \le m^*(|\varphi_p - f|) + m^*(|f - \varphi_q|)$,

by (1.11). The sequence with general term $m(\varphi_n)$ therefore satisfies Cauchy’s
convergence criterion (Chap. III, n° 10, Theorem 13). Its limit depends only
on $f$. For if $\psi_n$ is another sequence of step functions satisfying (1’) the relation

$|\varphi_n - \psi_n| \le |f - \varphi_n| + |f - \psi_n|$

shows, in a similar way, that $m(\varphi_n) - m(\psi_n)$ tends to 0.

It is natural to call the limit of the $m(\varphi_n)$ (common to all sequences of
step functions converging to $f$ in mean) the integral of $f$, and to denote it by
$m(f)$. This kind of argument, used in many other places, is similar to the one
we used to define $a^x$ for $a > 0$ and $x \in \mathbb{R}$, by approximating $x$ by a sequence
of rational numbers $x_n$ and showing that the sequence $a^{x_n}$ converges to a
limit independent of the sequence $(x_n)$ (Chapter IV, § 1, end of n° 2).

If an integrable function $f$ has real (resp. positive) values then its integral
is real (resp. positive). If $f$ is real, and if in (1’) one replaces $\varphi_n$ by $\mathrm{Re}\,\varphi_n$
one decreases the function $|f - \varphi_n|$ and so its upper integral, so that the
sequence of real functions $\mathrm{Re}(\varphi_n)$ again converges to $f$ in mean, whence the
first result. If, moreover, $f$ is positive, in which case one may assume the $\varphi_n$
real, one argues in the same way, replacing the $\varphi_n(x)$ by 0 on the intervals
where $\varphi_n < 0$: this can only decrease the value of $|f(x) - \varphi_n(x)|$, and so of
the upper integral.


If $f$ and $g$ are integrable then $f + g$ is integrable and

$m(f + g) = m(f) + m(g)$.

Take step functions $\varphi_n$ and $\psi_n$ converging in mean to $f$ and $g$, write

$|(f + g) - (\varphi_n + \psi_n)| \le |f - \varphi_n| + |g - \psi_n|$

to show that $\varphi_n + \psi_n$ converges to $f + g$ in mean, and use (1.6).

If $f$ is integrable then so is $\alpha f$ for any $\alpha \in \mathbb{C}$, and $m(\alpha f) = \alpha m(f)$.
Obvious: multiply $f$ and $\varphi$ by $\alpha$ in (1) and apply (1.12).

These first results already show, for real integrable $f$ and $g$, that

$f \le g$ implies $m(f) \le m(g)$,

since $0 \le m(g - f) = m(g) + m(-f) = m(g) - m(f)$.

If $f$ is integrable then so is $|f|$, and

(2.2)  $|m(f)| \le m(|f|) \le m(I)\|f\|_I$

where, we recall, $\|f\|_I = \sup |f(x)|$ is the norm of uniform convergence on $I$
(Chap. III, n° 7). For any complex numbers $\alpha$ and $\beta$ we have $\bigl||\alpha| - |\beta|\bigr| \le
|\alpha - \beta|$, whence, in the notation of (1’),

$\bigl||f(x)| - |\varphi_n(x)|\bigr| \le |f(x) - \varphi_n(x)|$  for all $x \in I$

and so $m^*(\bigl||f| - |\varphi_n|\bigr|) \le m^*(|f - \varphi_n|)$; this proves that $|f|$ is integrable like
$f$, since the $|\varphi_n|$ are also step functions. Since the integrals of $\varphi_n$ and $|\varphi_n|$
converge to those of $f$ and $|f|$, by definition of the latter, and since (2) applies
to the $\varphi_n$, one obtains the first inequality (2) in the limit. The second follows
from the fact that $|f(x)| \le \|f\|_I$ everywhere on $I$, so that $m(|f|)$ is less than
the integral of the constant function $x \mapsto \|f\|_I$.


The complex-valued function f is integrable if and only if the functions 
Re(f) and Im(f) are. If so, 


$m(f) = m[\mathrm{Re}(f)] + i\,m[\mathrm{Im}(f)]$.


Since $|\mathrm{Re}(f) - \mathrm{Re}(\varphi_n)| \le |f - \varphi_n|$, with a similar relation for the imaginary
parts, it is clear that Re(f) and Im(f) are integrable if f is; the relation to 
be shown then follows from the linearity properties already obtained; these 
show no less trivially that f is integrable if Re(f) and Im(f) are. 
A real function $f$ is integrable if and only if $m^*(f) = m_*(f)$.

Suppose first that $m_*(f) = m^*(f)$. Then, for every $r > 0$ there are step
functions $\varphi$ and $\psi$ framing $f$ (i.e. $\varphi \le f \le \psi$) whose integrals are equal to
within $r$. Since $|f - \varphi| = f - \varphi \le \psi - \varphi$ it follows that
$m^*(|f - \varphi|) \le m(\psi - \varphi) < r$, whence the integrability of $f$.

Suppose conversely that $f$ is integrable and consider a step function $\varphi$
such that

$m^*(|f - \varphi|) < r$;

one may assume $\varphi$ real as above. Since $m^*(|f - \varphi|)$ is, by definition, the lower
bound of the numbers $m(\psi)$ over all step functions $\psi \ge |f - \varphi|$, the strict
inequality proves the existence of a step function $\psi$ such that

$|f - \varphi| \le \psi$  and  $m(\psi) < r$.

Since $\varphi - \psi \le f \le \varphi + \psi$ we have thus framed $f$ between two step functions
whose difference has integral $< 2r$; so $m^*(f) = m_*(f)$. Moreover,

$m(\varphi - \psi) \le m^*(f) \le m(\varphi + \psi)$;

since $f$ is integrable we already know that this relation is preserved if one
replaces $m^*(f)$ by $m(f)$, whence $m(f) = m^*(f)$, since the extreme terms in
the preceding relation are equal to within $2r$.

To sum up: 


Theorem 1. Let $I$ be a bounded interval. (i) If the bounded functions $f$ and
$g$ are integrable on $I$, then so likewise is $\alpha f + \beta g$ for any constants $\alpha$ and $\beta$,
and

(2.3)  $m(\alpha f + \beta g) = \alpha m(f) + \beta m(g)$.

(ii) If $f$ is defined, bounded and integrable on $I$, then the function $|f|$ is
integrable, and

(2.4)  $|m(f)| \le m(|f|) \le m(I)\|f\|_I = m(I) \cdot \sup |f(x)|$.

(iii) The integral of a positive function is positive.

The standard notation

$m(f) = \int_I f(x)dx$

will be explained later (n° 3).

The definition of integrable functions shows immediately that, on a compact
interval, every regulated function is integrable; for every $r > 0$ there
exists, by the definition (Chap. III, n° 12) a step function $\varphi$ such that
$|f(x) - \varphi(x)| \le r$ for every $x$; then, by (1.10), $m^*(|f - \varphi|) \le m(I)r$, whence
the result. We shall prove later (n° 7) that, on a compact interval, every
continuous function is regulated, so integrable. One hardly needs more subtle
results in elementary analysis.

It is not difficult to construct non-integrable functions: it is enough to
take the Dirichlet function $f(x)$ on $I$, equal to 0 if $x \in \mathbb{Q}$ and to 1 if $x \notin \mathbb{Q}$.
Now, if a step function $\varphi \le f$ is constant on the intervals $I_k$ of a partition
of $I$, it must be $\le 0$ on every nonsingleton $I_k$ since such an interval contains
rational numbers where $f(x) = 0$; likewise, every step function $\psi \ge f$ must be
“almost” everywhere $\ge 1$. Thus $m_*(f) = 0$ and $m^*(f) = m(I)$. The Lebesgue
theory allows one to integrate the function $f$, with the same result as if one
had $f(x) = 1$ everywhere, and this because $\mathbb{Q}$ is countable. It may appear
bizarre to consider such functions (Newton would have said that one does
not meet them in Nature³), but it is one of those which led Cantor towards
his great set theory, not to be confused with the trivialities of Chap. I. Even
though the function in question is strange, one cannot deny it the merit of
simplicity; if analysis is incapable of integrating such functions, one might
begin to suspect that this is the fault of analysis and not of the function ...


We said above that the integral of a positive function is positive; could it 
perhaps be zero? This is one of the fundamental questions which the complete 
Lebesgue theory allows one to resolve. For the moment we make just two 
elementary remarks. 

If the integral of a continuous positive function $f$ is zero, then $f = 0$. For
if we have $f(a) = r > 0$ for some $a \in I$, then the continuity of $f$ shows that
$f(x) > r/2$ on an interval $J \subset I$ of length $> 0$; if $\varphi$ is the step function equal
to $r/2$ on $J$ and to 0 elsewhere then $m(f) \ge m(\varphi) = r\,m(J)/2 > 0$.

This result (which presupposes the integrability of the continuous functions
and uses the fact that, in the traditional theory, the measure of a nonempty
open interval is $> 0$) does not extend to discontinuous functions. For a
positive step function for example, it is clear that the integral vanishes if and
only if the points where the function does not vanish are finite in number. In
the much more general case of a regulated function, the apposite condition
is that the set defined by the relation $f(x) \neq 0$ should be countable (n° 7).


³ We will meet them in computer science when there exist machines capable of
distinguishing the rational numbers automatically from the others.

Before stating the next theorem let us note that if we have real functions
$f$ and $g$ defined on any set $X$ we can construct the functions

$\sup(f, g) : x \mapsto \max[f(x), g(x)]$,
$\inf(f, g) : x \mapsto \min[f(x), g(x)]$;

these definitions generalise in the obvious way to a finite number of functions
(and even to an infinite number on replacing max and min by sup and inf) and
lead us to the upper and lower envelopes of the given functions. In particular,
for every real function $f$ we can define the functions

$f^+ = \sup(f, 0) : x \mapsto f(x)^+$,
$f^- = \sup(-f, 0) : x \mapsto f(x)^-$,
$|f| : x \mapsto |f(x)|$

where, for every real number $x$, we put (Chap. II, n° 14)

$x^+ = \max(x, 0)$,   $x^- = \max(-x, 0)$;

it is trivial to show that, for every $x \in \mathbb{R}$,

$x = x^+ - x^-$,   $|x| = x^+ + x^-$

with similar relations for real-valued functions. An elementary argument,
which Figure 1 makes obvious, shows that

$\sup(f, g) = f + (g - f)^+$,   $\inf(f, g) = g - (g - f)^+$;

these operations are defined pointwise, using only the values taken at each
$x \in X$ by $f$ and $g$, so these relations follow from the same relations for real
numbers. See Chap. II, n° 14, where this notation has already been used.

Fig. 1.
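Because these identities are pointwise, they can be spot-checked on sample values. The sketch below does so with integer arithmetic; the functions `f`, `g` and the sample set `X` are arbitrary illustrative choices, and `pos`, `neg`, `sup_fn`, `inf_fn` are hypothetical helper names.

```python
def pos(x):
    """x^+ = max(x, 0)."""
    return max(x, 0)

def neg(x):
    """x^- = max(-x, 0)."""
    return max(-x, 0)

def sup_fn(f, g):
    """sup(f, g) written as f + (g - f)^+."""
    return lambda x: f(x) + pos(g(x) - f(x))

def inf_fn(f, g):
    """inf(f, g) written as g - (g - f)^+."""
    return lambda x: g(x) - pos(g(x) - f(x))

f = lambda x: x * x - 2
g = lambda x: 1 - x
X = range(-5, 6)

decomposition_ok = all(t == pos(t) - neg(t) and abs(t) == pos(t) + neg(t) for t in X)
sup_ok = all(sup_fn(f, g)(x) == max(f(x), g(x)) for x in X)
inf_ok = all(inf_fn(f, g)(x) == min(f(x), g(x)) for x in X)
```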


Theorem 2. If the real functions $f$ and $g$ are integrable on $I$, so are the
functions $\sup(f, g)$ and $\inf(f, g)$.

By Theorem 1 and the formula above it is enough to show that if $f$ is
integrable then so is $f^+$. This follows immediately from the definition, (1) or
(1’), and from the inequality $|f^+ - \varphi^+| \le |f - \varphi|$.

The preceding “theorem” shows more generally that the upper and lower
envelopes of a finite number of integrable real functions are again integrable.
When we try to extend this result to a countable family of functions we
embark on integration theory proper; see Appendix (L 16).


Theorem 3. Let $f$ and $g$ be two bounded integrable functions on a compact
interval $I$. Then the function $fg$ is integrable and (Cauchy-Schwarz inequality⁴)

(2.5)  $|m(fg)|^2 \le m(|f|^2)\,m(|g|^2)$.

⁴ Hermann Amandus Schwarz, German mathematician of the end of the XIXth century.
The Soviet mathematicians remarked several decades ago that one ought to


In checking that $fg$ is integrable we may assume $f$ and $g$ real, and even
positive, since every integrable real function $f$ is the difference of the
integrable functions $f^+$ and $f^-$. Given $r > 0$ we may choose positive step
functions $\varphi'$ and $\varphi''$ framing $f$, and $\psi'$ and $\psi''$ framing $g$, all less than a
fixed constant $M$ which simultaneously majorises $f$ and $g$. The product $fg$
is framed by $\varphi'\psi'$ and $\varphi''\psi''$, so we need only evaluate the integral of the
difference

$\varphi''\psi'' - \varphi'\psi' = \psi''(\varphi'' - \varphi') + \varphi'(\psi'' - \psi') \le M(\varphi'' - \varphi') + M(\psi'' - \psi')$.

The integrals of $\varphi'' - \varphi'$ and $\psi'' - \psi'$ can be chosen to be $< r/2M$, making
that of $\varphi''\psi'' - \varphi'\psi'$ $< r$ by a suitable choice of these functions, whence the
integrability of the product.

An immediate consequence of this result is that if $f$ is integrable on $I$
and if $J \subset I$ is an interval, then the function

(2.6)  $x \mapsto \chi_J(x) f(x)$, equal to $f(x)$ on $J$ and to 0 on $I - J$,

is again integrable. On multiplying step functions $\varphi_n$ converging in mean to $f$
on $I$ by the characteristic function $\chi_J$ of $J$ (Chap. I) one finds step functions
converging in mean to $\chi_J f$. Since it is clear that

(2.7)  $\int_J f(x)dx = \int_I \chi_J(x) f(x)dx$

is true for the $\varphi_n$ we get the same result for $f$. From this we deduce that

(2.8)  $\int_J f(x)dx = \sum_p \int_{J_p} f(x)dx$

if the intervals $J_p$ form a partition⁵ of $J$: the function $\chi_J$ is actually the sum
of the characteristic functions of the $J_p$. This is the additivity (it would be
better to say: the associativity) of the integral considered as a function of
the interval of integration, and not of the function being integrated. This
confirms in passing the existence of many interval functions that enjoy the
properties (M 1) and (M 2) of n° 1: choose a positive integrable function $\rho$
and put

$\mu(J) = \int_J \rho(x)dx$;

physically, this is a “distribution of mass” having a “density” $\rho(x)$ at each
point $x \in I$; we write $\mu(J)$ for the total mass of the interval $J$; the traditional
integral is obtained when $\rho(x) = 1$, a “homogeneous” distribution of mass.

speak of the Cauchy-Buniakowsky-Schwarz inequality, but their ancestor being
less known, even unknown, compared to the other two, the “Matthew effect” to
which we have alluded in Chap. III, n° 10, has applied in his case. Moreover,
in my youth, we spoke simply of the Schwarz inequality, despite the fact that
Cauchy already had quite a reputation ...

⁵ This hypothesis is not needed in the case of the usual measure (it is enough
that the intersections $J_p \cap J_q$ contain at most one point) for the integral over an
interval $J \subset I$ is clearly unchanged if one adjoins the end-points to $J$. But it is
essential in the case of a measure which includes discrete masses. This explains
the need to integrate over bounded rather than compact intervals: it is impossible
to construct a non-trivial finite partition of an interval into compact intervals.

The proof of (5) is an exercise in algebra (Appendix to Chap. III) not
specifically to do with integration theory; more exactly, it follows from the
formal properties (i) and (iii) of Theorem 1 alone, and not from the explicit
definition of an integral. We call the number

(2.9)  $(f \mid g) = m(f\bar{g})$

the scalar product of the functions $f$ and $g$ on the interval $I$. The inequality
to be established is then

(2.10)  $|(f \mid g)|^2 \le (f \mid f)(g \mid g)$.

It is clear that $(f \mid g)$ is a linear function of $f$ for $g$ given, that $(f \mid g) = \overline{(g \mid f)}$,
and that $(f \mid f) \ge 0$ for any $f$. For every constant $z \in \mathbb{C}$ we then have

(2.11)  $(f + zg \mid f + zg) = (f \mid f) + (f \mid zg) + (zg \mid f) + (zg \mid zg) = c + b\bar{z} + \bar{b}z + az\bar{z} \ge 0$  for every $z \in \mathbb{C}$,

with $c = (f \mid f)$, $b = (f \mid g)$ and $a = (g \mid g)$, notation chosen to evoke the
well-known second degree trinomials, although here the variable is complex;
we know in advance that $a$ and $c$ are $\ge 0$.

If $a \neq 0$ we can put $z = -b/a$, a value for which the right hand side of
(11) can be written $c - b\bar{b}/a - b\bar{b}/a + ab\bar{b}/a^2 = (ac - b\bar{b})/a$; since the left hand
side of (11) is $\ge 0$ and $a > 0$, the numerator of the result is $\ge 0$, whence (10) in
this case.

If $a = 0$, the expression (11) cannot be $\ge 0$ for every $z$ unless $b = 0$, in
which case (10) does not require proof. Indeed, if we replace $z$ by $tz$ with
$t \in \mathbb{R}$, we must then have $(b\bar{z} + \bar{b}z)t \ge -c$ for any $t$, which forces $b\bar{z} + \bar{b}z = 0$,
whence $b = 0$ since $z \in \mathbb{C}$ is arbitrary.
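The algebra above can be followed numerically for complex-valued step functions adapted to one and the same partition, where the scalar product is a finite sum. The sketch below assumes a partition of $[0, 1]$ into three intervals and arbitrary values for `f` and `g`; all names and numbers are illustrative. It evaluates $a$, $b$, $c$, substitutes $z = -b/a$ and compares with $(ac - b\bar{b})/a$.

```python
import math

lengths = [0.5, 0.25, 0.25]          # m(I_k) for a hypothetical partition of [0, 1]
f = [1 + 2j, -1j, 3 + 0j]            # value of f on each I_k
g = [2 + 0j, 1 + 1j, -1 - 1j]        # value of g on each I_k

def inner(u, v):
    """(u | v) = m(u * conj(v)) for step functions on the common partition."""
    return sum(uk * vk.conjugate() * mk for uk, vk, mk in zip(u, v, lengths))

def norm2(u):
    """(u | u)^(1/2)."""
    return math.sqrt(inner(u, u).real)

a, b, c = inner(g, g).real, inner(f, g), inner(f, f).real

# Value of the trinomial (2.11) at z = -b/a, computed directly ...
z = -b / a
shifted = [fk + z * gk for fk, gk in zip(f, g)]
trinomial_at_z = inner(shifted, shifted).real
# ... and by the closed form obtained in the text.
closed_form = (a * c - abs(b) ** 2) / a

f_plus_g = [fk + gk for fk, gk in zip(f, g)]
```

With these values $a = 3$, $c = 5$ and $b = 2.5i$, so that $|b|^2 = 6.25 \le 15 = ac$.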


The Cauchy-Schwarz inequality shows that

$(f + g \mid f + g) = (f \mid f) + (f \mid g) + (g \mid f) + (g \mid g)$
$= (f \mid f) + 2\,\mathrm{Re}(f \mid g) + (g \mid g) \le (f \mid f) + 2|(f \mid g)| + (g \mid g)$
$\le (f \mid f) + 2(f \mid f)^{1/2}(g \mid g)^{1/2} + (g \mid g)$,

whence, on taking the square roots of the two sides,

(2.12)  $(f + g \mid f + g)^{1/2} \le (f \mid f)^{1/2} + (g \mid g)^{1/2}$.

The expression

(2.13)  $\|f\|_2 = (f \mid f)^{1/2} = m(|f|^2)^{1/2} = \left(\int_I |f(x)|^2 dx\right)^{1/2}$

is called the $L^2$ norm of the function $f$ on $I$; the inequality (12) shows that

(2.14)  $\|f + g\|_2 \le \|f\|_2 + \|g\|_2$

and clearly $\|\lambda f\|_2 = |\lambda| \cdot \|f\|_2$ for every constant $\lambda \in \mathbb{C}$. This justifies the
word “norm”, apart from the fact that the norm can be zero for functions
which are not identically zero. The calculation preceding formula (12) also
shows that

(2.15)  $(f \mid g) = 0 \implies \|f + g\|_2^2 = \|f\|_2^2 + \|g\|_2^2$,

the integral version of Pythagoras’ Theorem; one says then that $f$ and $g$ are
orthogonal.

We define the $L^1$ norm also, by

(2.16)  $\|f\|_1 = m(|f|)$,

and we again have (14) in this case, and much more easily, since $|f + g| \le
|f| + |g|$.


For every real number $p > 1$, one defines more generally the $L^p$ norm
by

(2.17)  $N_p(f) = m(|f|^p)^{1/p} = \|f\|_p$;

n° 14 on convex functions will show that again in this case

(2.18)  $\|f + g\|_p \le \|f\|_p + \|g\|_p$

and that

(2.19)  $|(f \mid g)| \le \|f\|_p \cdot \|g\|_q$  if $1/p + 1/q = 1$;

these are the famous (but, at our level, largely useless) Minkowski
and Hölder inequalities.

As for the notation $L^2$, or $L^1$ or $L^p$, these allude to the “grand” integration
theory. These calculations play a fundamental rôle in the theory of
Fourier series, as we shall see a little below.
On several occasions we have remarked that the explicit construction of
the integral does not feature in establishing Theorems 2 and 3, nor, as we
shall see, in many other cases. It occurs elsewhere because of the translation
invariance of the Euclidean measure of length. To translate this into the
language of integration one writes the formula

(2.20)  $\int_{a+c}^{b+c} f(x)dx = \int_a^b f(x + c)dx$,

to be interpreted as follows: if $x \mapsto f(x)$ is integrable on $[a + c, b + c]$, then
$x \mapsto f(x + c)$ is integrable on $[a, b]$ and (20) holds. In other words, if one has
an integrable function $f$ on an interval $I$ and if one submits both $I$ and the
graph of $f$ to the same horizontal translation, then nothing changes. This is
quite clear for step functions, and we leave the epsilontics for the reader to
check.

This result may appear (and is) trivial. Yet not only is it of constant use, 
it characterises Euclidean measure up to a constant factor among all those 
measures which satisfy conditions (M 1) and (M 2) of n° 1. This is also the 
key to the generalisations of Fourier analysis to group theory, a boom topic 
for more than fifty years. 

To give an application we shall use in n° 5, let us consider a function $f(x)$
of period 1 on $\mathbb{R}$ and show that

(2.21)  $\int_a^{a+1} f(x)dx = \int_0^1 f(x)dx$,

in other words that the left hand side is independent of $a$. To do this we
consider the integer $n$ such that $a \le n < a + 1$. By the additivity formula (8),
the integral over $[n, n + 1]$ is the sum of the integrals over $[n, a + 1]$ and
$[a + 1, n + 1]$. By (20), the second is also the integral over $[a, n]$ of the function
$x \mapsto f(x + 1) = f(x)$. The integral over $[n, n + 1]$, equal for the same reason
of periodicity to the integral over $[0, 1]$, is thus the sum of the integrals of $f$
over $[a, n]$ and $[n, a + 1]$, which is the integral on $[a, a + 1]$, qed.
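A numerical illustration of (2.21), offered only as a sketch: the periodic function below and the midpoint-sum approximation `midpoint_sum` are assumptions made for the example (the exact integral over one period of this particular `f` is 1/2). The function `split_at_integer` follows the decomposition used in the proof, cutting $[a, a+1]$ at the integer $n$ with $a \le n < a + 1$.

```python
import math

def f(x):
    """A function of period 1 (an arbitrary choice for illustration)."""
    return math.cos(2 * math.pi * x) ** 2 + 0.5 * math.sin(4 * math.pi * x)

def midpoint_sum(func, a, b, n=2000):
    """Approximate the integral of func over [a, b] by a midpoint Riemann sum."""
    h = (b - a) / n
    return sum(func(a + (k + 0.5) * h) for k in range(n)) * h

def over_one_period(a):
    return midpoint_sum(f, a, a + 1)

def split_at_integer(a):
    n = math.ceil(a)          # the integer with a <= n < a + 1
    return midpoint_sum(f, a, n) + midpoint_sum(f, n, a + 1)

starts = (-2.3, 0.0, 0.37, 5.9)
values = [over_one_period(a) for a in starts]
split_values = [split_at_integer(a) for a in starts]
```

Up to rounding and discretisation error, every entry of `values` and `split_values` is the same number, whatever the starting point `a`.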


3 — Riemann sums. The integral notation 


The relation (1.2) or (1.3) allows one to show how to calculate the integral
of a complex-valued function $f$ approximately from the Riemann sums (or
Cauchy, not to go back to Fermat or even to Archimedes ...). Assume $f$
regulated, enough for elementary use, and, given a number $r > 0$, let $(I_k)$
be a partition of $I$ into intervals on each of which $f$ is constant to within
$r$. Choose a $\xi_k \in I_k$ at random in each $I_k$ and consider the step function $\varphi$
which on each $I_k$ takes the value $c_k = f(\xi_k)$; now $|f(x) - \varphi(x)| \le r$ for each
$x \in I$, so $\|f - \varphi\|_I \le r$, whence, by (2.4),

(3.1)  $\left| m(f) - \sum m(I_k) f(\xi_k) \right| \le m(I)r$.

On replacing this partition by a subdivision

(3.2)  $a = x_1 < x_2 < \ldots < x_{n+1} = b$

of $I$ as in n° 1, and choosing a point $\xi_k$ at random in the open interval
$]x_k, x_{k+1}[$, we obtain

(3.3)  $\left| m(f) - \sum f(\xi_k)(x_{k+1} - x_k) \right| \le m(I)r$;

the fact that a singleton interval is of zero measure, which does not feature
in deriving (1), justifies (3) in the case of the usual measure. We may note
that this argument applies verbatim to vector-valued functions.

What is more, these inequalities remain valid for every partition finer
than $(I_k)$; for they rely only on the hypothesis that $f$ is constant to within
$r$ on each of these intervals, a hypothesis true also for every partition finer
than $(I_k)$.
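The bound (3.3) can be observed experimentally. In the sketch below, $f(x) = x^2$ on $[0, 1]$ is an illustrative choice: since $|f(x) - f(y)| \le 2|x - y|$ there, $f$ is constant to within $r = 2/n$ on each interval of the uniform subdivision into $n$ parts, so (3.3) predicts an error of at most $2/n$ for any choice of the $\xi_k$. The exact value $m(f) = 1/3$ is taken as known (it is not computed in the text), and the name `riemann_sum` and the seeded random generator are assumptions of the example.

```python
import random

def riemann_sum(f, a, b, n, rng):
    """Sum of f(xi_k) * (x_{k+1} - x_k) with xi_k drawn at random in each interval."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(f(rng.uniform(x0, x1)) * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))

f = lambda x: x * x
exact = 1 / 3                      # assumed known value of the integral on [0, 1]
rng = random.Random(0)             # fixed seed so that the run is reproducible

errors = {n: abs(exact - riemann_sum(f, 0.0, 1.0, n, rng)) for n in (10, 100, 1000)}
bounds = {n: 2 / n for n in errors}    # m(I) * r with m(I) = 1 and r = 2/n
```

Refining the subdivision shrinks $r$, and with it the guaranteed error, whichever tags are drawn.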

Relation (3) explains the notation

$m(f) = \int_I f(x)dx = \int_a^b f(x)dx$

used to denote an integral. In this notation,

$(f \mid g) = \int f(x)\overline{g(x)}dx$,   $\|f\|_2 = \left(\int |f(x)|^2 dx\right)^{1/2}$,   $\|f\|_1 = \int |f(x)|dx$.


The analogy with the notation for series would be complete if one wrote 


$\int_{x \in I} f(x)dx$  or  $\int_{a \le x \le b} f(x)dx$  or  $\int_{x=a}^{x=b} f(x)dx$.


It seems quite curious that the sign $\int$, invented by Leibniz in 1675, appeared
fully 150 years before the sign $\sum$ of which one finds no trace in Fourier nor
in Cauchy’s Cours d’analyse of 1821. On the other hand, Leibniz and his
XVIIIth century successors never wrote the limits of integration explicitly,
which can be rather a nuisance; the modern notation appeared in Fourier’s
Théorie analytique de la chaleur of 1822; but in 1807, when he was composing
his fundamental memoir, refused by the Académie des sciences, Fourier still
wrote, for example, S(sin.x φx dx) for what we now write as

$\int_0^{2\pi} \varphi(x) \sin x \, dx$.

Leibniz’ notation is explained by his conception of the integral, inherited
from certain of his predecessors and notably from the Italian Cavalieri. For
them it was to calculate the area bounded by the axis $Ox$, the graph of $f$,
and the verticals through the end-points of $I$. They imagined $I$ to be composed
of “infinitely small” or “indivisible” intervals, which Leibniz denoted
by $(x, x + dx)$ and, consequently, that the area to be calculated is composed
of infinitely thin vertical slices having these intervals for bases and the numbers
$f(x)$ as their heights. The area of such a slice is “clearly” $f(x)dx$, so
that the area to be calculated is the “continuous sum” (in contrast to the
“discrete sum”, i.e. to the series) of these infinitesimal areas; whence the
notation, in which the sign $\int$ is an abbreviation of the word “sum” or of
its Latin equivalent. All this is metaphysics. But since, three centuries after
Leibniz, Mankind has not felt the need to change his notation, whether dealing
with integrals for neophytes or with their most abstract generalisations,
it looks as though no one knows how to do better.

Before Leibniz, Cavalieri used the word “omnia”, all, or “omn.”, instead
of the sign $\int$; after reading Cavalieri, Leibniz wrote in 1675 in a Latin that
one can understand untaught, “Utile erit scribi $\int$ pro omn. ut $\int l$ pro omn.
l id est summa ipsorum l” (Cantor, III, p. 166; in Cavalieri, one adds the
lengths, denoted l). Others, like Wallis and Newton, wrote a square before
the integrand⁶, as in the formula


$\square x^2 = b^3/3 - a^3/3,$


the square evoking the word “quadrature” which, at the time, meant pre- 
cisely: to construct a square whose area is equal to the area bounded by a 
curve, as in the problem of the “quadrature of the circle”. Here again we see 
to what extent the choice of good notation can contribute to the advancement 
and to the comprehension of mathematics. 

Further, Leibniz’ notation led directly to the definition of the integral
given by Cauchy. Instead of considering the infinitesimal expressions f(x)dx
Cauchy used a subdivision of I as above, and considered the sum

$\sum f(x_k)(x_{k+1} - x_k),$

traditionally denoted $\sum f(x_k)\,\Delta x_k$, because the letter $\Delta$ is the initial of the
word “difference”. The integral of f is, for him, the limit of these sums as
the subdivision becomes finer, which is indeed the case, as we shall see, for
continuous functions.


4 — Uniform limits of integrable functions 


⁶ During his controversies with Leibniz at the start of the XVIIIth century, Newton
claimed to be the first to have invented a symbol to denote an integral. Quite
possible, but his was perfectly unusable, principally because of its typographical
clumsiness. Leibniz’ notation is furthermore neatly adapted to the change of
variable formula, to multiple integrals, etc.

The relation

(4.1) $|m(f)| \le m(|f|) \le m(I)\,\|f\|_I,$


valid for every bounded integrable function f on a compact (or more generally
bounded) interval I, is fundamental; it allows one, in many situations, to
argue without recourse to the explicit construction of the integral expounded
in n° 1 and 2. Here is an immediate consequence:


Theorem 4. Let $(f_n)$ be a uniformly convergent sequence of integrable
functions on a bounded interval I. Then the function $f(x) = \lim f_n(x)$ is
integrable and

(4.2) $m(f) = \int_I f(x)\,dx = \lim \int_I f_n(x)\,dx = \lim m(f_n).$


For r > 0 given, and for every n, let us choose a step function $\varphi_n$ such
that $m(|f_n - \varphi_n|) < r$, and let N be an integer such that

$n > N \Longrightarrow \|f - f_n\|_I < r$

from the definition of uniform convergence. For n > N we have

$m^*(|f - \varphi_n|) \le m^*(|f - f_n|) + m^*(|f_n - \varphi_n|) \le m(I)\,r + r,$

whence the integrability of f. Now (4.2) follows from the fact that

$|m(f) - m(f_n)| \le m(|f - f_n|) \le m(I)\,\|f - f_n\|_I \le m(I)\,r,$

qed.
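
The last inequality of the proof can be checked numerically. The sketch below is only an illustration under assumptions of our own: the sequence $f_n(x) = \sqrt{x^2 + 1/n^2}$, which converges uniformly to $|x|$ on [-1, 1] with $\|f - f_n\|_I = 1/n$, and a midpoint Riemann sum standing in for the integral m. Neither choice comes from the text.

```python
import math

def midpoint_integral(func, a, b, steps=2000):
    """Midpoint Riemann sum of func over [a, b] (a stand-in for m)."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def f_n(n):
    """Illustrative sequence (an assumption): f_n(x) = sqrt(x^2 + 1/n^2)."""
    return lambda x: math.sqrt(x * x + 1.0 / (n * n))

def check(n, a=-1.0, b=1.0):
    """Return (|m(f) - m(f_n)|, m(I) * sup|f - f_n|) for f(x) = |x|."""
    gap = abs(midpoint_integral(abs, a, b) - midpoint_integral(f_n(n), a, b))
    sup_norm = 1.0 / n  # sup of sqrt(x^2 + 1/n^2) - |x| is attained at x = 0
    return gap, (b - a) * sup_norm

for n in (1, 10, 100):
    gap, bound = check(n)
    print(n, gap, bound)  # the gap should stay below the bound
```

The gap between the integrals shrinks with n and never exceeds $m(I)\,\|f - f_n\|_I$, as the estimate predicts.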

Proper integration theory will allow us to prove a much stronger result 
than the preceding: one can replace uniform convergence by simple conver- 
gence (and even much less) on condition that one assumes that there is an 
integrable function g > 0 such that 


$|f_n(x)| \le g(x)$


for every n and every x (Appendix, L19). The limit function f, though
integrable in the modern sense of the term, need not be so in the archaic sense
expounded here, even if the $f_n$ and g are. Nevertheless this can happen, in
which case we have a result for Riemann integrals:


(Dominated convergence). Let $(f_n)$ be a sequence of functions defined
and integrable on an interval I; assume that (i) the $f_n$ converge simply to an
integrable function f; (ii) there exists an integrable function g such that


$|f_n(x)| \le |g(x)|$ for every n and every $x \in I$.


Then

$m(f) = \lim m(f_n).$

Since we cannot prove this very handy result yet, simple in appearance
though it is (it is the analogue for “continuous sums” of the theorems on
passage to the limit for sequences of normally convergent series, Chap. III,
n° 13 and Chap. IV, n° 12), we shall not use it, except, sometimes, to
show how it would greatly simplify those “elementary” proofs that require
recourse to uniform convergence. The necessity of a hypothesis such as (ii)
is quite clear from Figure 2: the functions $f_n$ converge simply to 0 but their
integrals are all equal to 1.


Fig. 2. 
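
Since the figure itself is not reproduced here, the following sketch builds one possible family with the stated behaviour. The "tent" functions used (height n, base [0, 2/n]) are our own assumption and need not be the ones drawn in Figure 2. Exact rational arithmetic avoids any rounding doubt.

```python
from fractions import Fraction

def tent(n, x):
    """Hypothetical f_n: rises linearly to height n at x = 1/n, falls back to 0 at x = 2/n."""
    x = Fraction(x)
    peak = Fraction(1, n)
    if 0 <= x <= peak:
        return n * n * x
    if peak < x <= 2 * peak:
        return n * (2 - n * x)
    return Fraction(0)

def tent_integral(n):
    """Exact integral of the piecewise linear f_n: trapezoid rule on its breakpoints."""
    pts = [Fraction(0), Fraction(1, n), Fraction(2, n)]
    return sum((q - p) * (tent(n, p) + tent(n, q)) / 2 for p, q in zip(pts, pts[1:]))

# For a fixed x > 0 the values f_n(x) vanish as soon as 2/n < x,
# yet every integral equals 1, so lim m(f_n) = 1 while m(lim f_n) = 0.
print([tent(n, Fraction(1, 4)) for n in (2, 4, 8, 9, 20)])
print([tent_integral(n) for n in (1, 5, 50)])
```

No integrable g can dominate all these tents at once, which is what hypothesis (ii) rules out.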


Theorem 4 is nevertheless prodigiously useful as we shall see immediately
and in the following n°. In particular, it applies to a uniformly convergent,
or a fortiori normally convergent, series $\sum u_n(x)$ of integrable functions: the
sum of such a series is again integrable and


(4.3) $m\left(\sum u_n\right) = \sum m(u_n),$

i.e.

(4.3’) $\int_I \left[\sum u_n(x)\right] dx = \sum \int_I u_n(x)\,dx,$


the series on the right hand side being convergent and even, in the case of
normal convergence, absolutely convergent, since


$|m(u_n)| \le m(I)\,\|u_n\|_I$

with $\sum \|u_n\|_I < +\infty$ by hypothesis.

Example 1. Consider a power series

(4.4) $f(z) = \sum c_n z^n/n! = \sum c_n z^{[n]}$


which converges on a disc |z| < R of nonzero radius, and let us calculate
the integral of f(x) over an interval [a,b] with $-R < a < b < R$. We know
that the series converges normally on every disc $|z| \le r < R$, so on the
interval considered: we can therefore integrate term-by-term. But we also
know (Chap. II, n° 11) that


(4.5) $\int_a^b x^n\,dx = \frac{b^{n+1}}{n+1} - \frac{a^{n+1}}{n+1}$

or, equivalently, that

(4.5’) $\int_a^b x^{[n]}\,dx = b^{[n+1]} - a^{[n+1]}.$

Thus we find

(4.6) $\int_a^b f(x)\,dx = F(b) - F(a)$

where


$F(z) = \sum c_n z^{[n+1]} = \sum c_{n-1} z^{[n]}$


is the primitive power series of f in the sense of Chap. II, n° 19. Since we
also know that the function F is differentiable (in the complex sense on C
so a fortiori in the real sense on R) and that f is its derivative, (6) is just a
particular case of the “fundamental theorem” which we will establish later:


$f = F' \Longrightarrow \int_a^b f(x)\,dx = F(b) - F(a).$

It was this kind of calculation that led Newton and Mercator to the series 

(4.7) $\log(1+x) = x - x^2/2 + x^3/3 - \dots$


They knew (and we know: Chap. II, n° 11) that the left hand side is the
integral of the function $t \mapsto 1/(1+t)$ over the interval [0,x] and they were
aware of (5). The calculation is then obvious, particularly if one does not
worry about Theorem 4 any more than they did. Conversely, if one first
knew the formula (7), one might deduce that the integral of the function
1/(1+x) between x = a and x = b is equal to $\log(1 + b) - \log(1 + a)$, but
this assumes that a and b lie in the interval of convergence of (7): a direct
calculation, in Chap. II, n° 11, has already provided the result free from this
restriction. The reader may amuse himself by applying (6) in the same way
to the series representing $e^x$, sin x, cos x, etc., since their primitive series can
be calculated immediately; one obtains the formulae


$\int_a^b \cos x\,dx = \sin b - \sin a, \qquad \int_a^b \sin x\,dx = \cos a - \cos b,$


(4.8) $\int_a^b e^{tx}\,dx = (e^{tb} - e^{ta})/t \qquad (t \in \mathbb{C},\ t \ne 0),$

$\int_a^b dx/(1 + x^2) = \arctan b - \arctan a,$


etc., which confirm the “fundamental theorem”. 
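
As a small numerical companion to (4.6) and (4.7), the sketch below integrates the geometric series $1 - t + t^2 - \dots$ term by term and compares the result with the logarithm. The function names and the choice of 200 terms are ours, and the comparison only makes sense for a and b strictly inside the interval of convergence.

```python
import math

def log1p_series(x, terms=200):
    """Partial sum of x - x^2/2 + x^3/3 - ..., i.e. the geometric series
    1 - t + t^2 - ... integrated term by term over [0, x]."""
    return sum((-1) ** k * x ** (k + 1) / (k + 1) for k in range(terms))

def integral_via_series(a, b, terms=200):
    """F(b) - F(a) for the primitive series F of 1/(1 + x), as in (4.6)."""
    return log1p_series(b, terms) - log1p_series(a, terms)

a, b = -0.5, 0.5
print(integral_via_series(a, b))
print(math.log(1 + b) - math.log(1 + a))
```

Both printed values should agree to within rounding error, since for |x| = 1/2 the neglected tail of the series is far below machine precision.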


To conclude these trivialities on uniform limits we remark that the relation 


(4.9) $\lim f_n = f \Longrightarrow \lim m(f_n) = m(f),$


valid in the framework of uniform convergence, remains so under much
less restrictive hypotheses.

If one replaces g by 1 in the Cauchy-Schwarz inequality one obtains the 
relation 


(4.10) $|m(f)| \le m(I)^{1/2}\,\|f\|_2.$


From this one deduces that

(4.11) $\lim \|f_n - f\|_2 = 0 \Longrightarrow \lim m(f_n) = m(f);$

the same result holds for the $L^p$ norms, p > 1, and for the norm $\|\ \|_1$, since


$|m(f_n) - m(f)| \le \|f_n - f\|_1$


in this case. In other words, to obtain (9) it is enough to assume that there
exists a real number $p \ge 1$ such that the integral of the function $|f_n(x) - f(x)|^p$
tends to 0.

When $\lim \|f_n - f\|_p = 0$, one says that $f_n$ converges to f in mean of
order p (“in mean” for short if p = 1, “in quadratic mean” if p = 2). This is
clearly the case if we have uniform convergence, but the converse is false since
the value of an integral has no direct connection with that of the function
at a given point or even on the neighbourhood of a point. If for example
we take on I = [0,1] the functions $f_n(x) = n$ for $0 < x < 1/n^2$, = 0 if
not, then we have $m(f_n) = 1/n$ and convergence to 0 in mean, although the
convergence is not uniform since $\|f_n\|_I = n$. What is essential to
ensure convergence in mean is that, for n large, the difference $|f_n(x) - f(x)|$
should not be $> 10^{100}$ except on intervals of total length much smaller than
$10^{-100}$. All electricians know this, particularly in the case p = 2, since, for
example, to calculate the power dissipated by an electric current of variable
intensity I(t) passing through a resistance during an interval of time [a, b],
one integrates the function $I(t)^2$ over it; “surges of current” have no influence
on the result if they are concentrated on sufficiently small intervals of time.
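
The example $f_n = n$ on $]0, 1/n^2[$ is simple enough to tabulate exactly. The helper below is only a sketch (the function name is ours); it returns the uniform norm, the norm $\|f_n\|_1$ and the square of $\|f_n\|_2$ as exact fractions.

```python
from fractions import Fraction

def norms(n):
    """Exact norms of f_n = n on ]0, 1/n^2[ and 0 elsewhere on [0, 1]."""
    width = Fraction(1, n * n)           # length of the interval where f_n = n
    height = Fraction(n)
    sup_norm = height                    # uniform norm: grows without bound
    l1_norm = height * width             # integral of |f_n|: equals 1/n
    l2_norm_squared = height ** 2 * width  # integral of |f_n|^2: equals 1
    return sup_norm, l1_norm, l2_norm_squared

for n in (1, 10, 100):
    print(n, norms(n))
```

One sees that the sequence tends to 0 in mean (p = 1) while its uniform norm blows up; for this particular sequence the integral of $|f_n|^2$ stays equal to 1, so the convergence in mean of order p really does depend on p.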


5 — Application to Fourier series and to power series


For a long time it was believed, and neophytes sometimes still believe, if they
trust low-level books, that integrals serve to calculate areas, volumes,
centres of gravity, magnetic flux, etc. This is not false, it was even positively true
in the XVIIth century, but for ages they have served quite another purpose,
namely: to do mathematics, in other words, to prove theorems. At the point
where we now are in expounding the theory, we still know almost nothing.
Yet nevertheless ...

Consider an absolutely convergent Fourier series of period 1, i.e. of the 
form 


(5.1) $f(x) = \sum a_n e^{2\pi i n x}$ with $\sum |a_n| < +\infty,$


where one sums over all $n \in \mathbb{Z}$ and where the factor $2\pi$ has been introduced
into the exponents to simplify the formulae a little. Note in passing that
the Euler relation $e^{ix} = \cos x + i\sin x$ allows us to write (1) in the more
traditional form


$f(x) = a_0 + \sum_{n=1}^{\infty} \left(b_n \cos 2\pi n x + c_n \sin 2\pi n x\right)$


which is less convenient computationally.

The first problem of the theory is to calculate the coefficients in (1) from f.
To this purpose we remark that, for any $p, q \in \mathbb{Z}$ and $a \in \mathbb{R}$, we have⁷


(5.2) $\int_a^{a+1} e^{2\pi i p x}\,\overline{e^{2\pi i q x}}\,dx = \int_a^{a+1} e^{2\pi i (p-q) x}\,dx = \begin{cases} 1 & \text{if } p = q, \\ 0 & \text{if } p \ne q. \end{cases}$


If p = q we are integrating the constant function 1. If $p \ne q$, we can put
$t = 2\pi i(p - q)$ and apply (4.8); we find the variation between x = a and
x = a + 1 of the function $e^{tx}/t$; since t is a multiple of $2\pi i$ this function is
of period 1, so takes the same values at a and a + 1 (there is no point in
calculating them explicitly, except to increase the chance of error), so that
the integral in question is zero. If, to simplify the notation, one puts


(5.3) $e_n(x) = e^{2\pi i n x} = \exp(2\pi i n x)$


and if one uses the notation


(5.4) $(f \mid g) = \int_a^{a+1} f(x)\overline{g(x)}\,dx = m(f\bar{g})$

of the end of n° 2 to denote the scalar product of two functions f and g of
period 1, then the preceding formulae can be written

(5.2’) $(e_p \mid e_q) = \begin{cases} 1 & \text{if } p = q, \\ 0 & \text{if } p \ne q. \end{cases}$

With this notation the series (1) can be written


$f(x) = \sum a_n e_n(x).$


⁷ The integral over an interval [a, a+1] depends only on the integrand if the latter
is of period 1, as we saw at the end of n° 2.


For $p \in \mathbb{Z}$ given, let us consider the scalar product


$(f \mid e_p) = m(f\,\overline{e_p}) = m(f e_{-p}).$


Since $\sum |a_n| < +\infty$, and since the exponentials are all of modulus 1 because
the exponents are purely imaginary, the series


$f(x)\overline{e_p(x)} = \sum a_n e_n(x)\overline{e_p(x)}$


converges normally, so can be integrated term-by-term; in view of (2’) the
only term which yields a nonzero result in the integration is that for which
n = p, so finally we find the relation


(5.5) $a_p = (f \mid e_p) = \int_a^{a+1} f(x)\,e^{-2\pi i p x}\,dx.$
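
Formula (5.5) can be tried out on a finite Fourier series. In the sketch below the coefficients are hypothetical values chosen for illustration, and the scalar product is replaced by an equally spaced Riemann sum, which integrates $e_m$ exactly (up to rounding) whenever 0 < |m| < `samples`; both are assumptions of ours, not part of the text.

```python
import cmath

def e(n, x):
    """e_n(x) = exp(2*pi*i*n*x), as in (5.3)."""
    return cmath.exp(2j * cmath.pi * n * x)

def inner(f, g, samples=64, a=0.0):
    """Riemann-sum stand-in for (f|g), the integral of f * conj(g) over [a, a+1]."""
    return sum(f(a + k / samples) * g(a + k / samples).conjugate()
               for k in range(samples)) / samples

# Hypothetical coefficients a_n of a trigonometric polynomial f.
coeffs = {-2: 0.5, 0: 1.0, 3: -0.25j}

def f(x):
    return sum(c * e(n, x) for n, c in coeffs.items())

def fourier_coefficient(p, a=0.0):
    """a_p = (f | e_p), computed over the interval [a, a+1]."""
    return inner(f, lambda x: e(p, x), a=a)

for p in range(-3, 4):
    print(p, fourier_coefficient(p))
```

The printed values reproduce the chosen coefficients and vanish for the other indices; passing a different `a` changes nothing, in line with the remark that the interval [a, a+1] is arbitrary.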


This formula is the basis of the theory of Fourier series: one starts from a
given periodic function f(x), uses (5) to define the coefficients $a_n$ and hopes
that the function f is represented by the series (1). This heavenly vision of
the theory does not, alas, correspond to reality once one leaves the domain of
periodic functions of class $C^1$. To begin with, the series $\sum a_n$ may well not
be absolutely convergent: the case of the square waves of Chap. III, n° 2.

Let us now consider two absolutely convergent Fourier series


$f(x) = \sum a_n e_n(x), \qquad g(x) = \sum b_n e_n(x)$


and calculate their scalar product. The multiplication theorem for absolutely
convergent series shows that


$f(x)\overline{g(x)} = \sum a_p \overline{b_q}\, e_p(x)\overline{e_q(x)} = \sum a_p \overline{b_q}\, e_{p-q}(x)$


on using the relations


$e_p(x)e_q(x) = e_{p+q}(x), \qquad \overline{e_n(x)} = e_{-n}(x) = e_n(-x).$


The double series converges unconditionally and normally since by hypothesis
the series $\sum a_p$ and $\sum b_q$ converge absolutely. We can then integrate
term-by-term over [0, 1], whence


$(f \mid g) = \sum a_p \overline{b_q}\,(e_p \mid e_q);$
the terms for which $p \ne q$ disappear and the Parseval-Bessel formula


(5.6) $(f \mid g) = \int_a^{a+1} f(x)\overline{g(x)}\,dx = \sum a_n \overline{b_n}$


remains. In particular, for any $a \in \mathbb{R}$,


(5.7) $\|f\|_2^2 = (f \mid f) = \int_a^{a+1} |f(x)|^2\,dx = \sum |a_n|^2.$
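
Here is a quick numerical check of (5.6) and (5.7) on two finite Fourier series. The coefficient tables are hypothetical, and the integral is again approximated by an equally spaced Riemann sum, which is exact up to rounding for trigonometric polynomials of low degree; treat it as a sketch rather than a general-purpose routine.

```python
import cmath

def e(n, x):
    """e_n(x) = exp(2*pi*i*n*x)."""
    return cmath.exp(2j * cmath.pi * n * x)

def series(coeffs):
    """Function x -> sum of c_n e_n(x) for a finite table of coefficients."""
    return lambda x: sum(c * e(n, x) for n, c in coeffs.items())

def inner(f, g, samples=64):
    """Riemann-sum stand-in for (f|g) over [0, 1]."""
    return sum(f(k / samples) * g(k / samples).conjugate()
               for k in range(samples)) / samples

# Hypothetical coefficient tables, chosen only for illustration.
a = {-1: 2.0, 0: 1j, 2: -0.5}
b = {-1: 1.0, 2: 3j, 4: 1.0}

lhs = inner(series(a), series(b))
rhs = sum(a[n] * complex(b[n]).conjugate() for n in a if n in b)
norm_sq = inner(series(a), series(a))

print(lhs, rhs)        # both sides of (5.6)
print(norm_sq.real)    # (5.7): should equal 4 + 1 + 0.25
```

Only the indices common to both tables contribute to the right hand side, exactly as the orthogonality relations (5.2’) require.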


These proofs do not apply to the square wave series of Chap. III, n° 2
and n° 11, but one can always examine what the results might mean in this
case. To reduce to a Fourier series of period 1, one has to replace x by $2\pi x$
in the series $\cos x - \cos 3x/3 + \cos 5x/5 - \dots$, i.e. to consider the series


(5.8) $f(x) = \cos 2\pi x - \cos(6\pi x)/3 + \cos(10\pi x)/5 - \dots = [e_1(x) + e_{-1}(x)]/2 - [e_3(x)/3 + e_{-3}(x)/3]/2 + \dots$


whose sum, if one believes Fourier, is given by

(5.9) $f(x) = \pi/4$ for $|x| < 1/4$, $= -\pi/4$ for $1/4 < |x| < 3/4$,


and by periodicity for the other values of x. If one accepts (9), the formula
(5) with a = -1/4 here gives, up to a factor $\pi/4$ and using (4.8),


$\int_{-1/4}^{1/4} e^{-2\pi i p x}\,dx - \int_{1/4}^{3/4} e^{-2\pi i p x}\,dx =$

$= \frac{e^{-\pi i p/2} - e^{\pi i p/2}}{-2\pi i p} - \frac{e^{-3\pi i p/2} - e^{-\pi i p/2}}{-2\pi i p} =$

$= (e^{\pi i p/2} - e^{-\pi i p/2})/2\pi i p - e^{-\pi i p}(e^{\pi i p/2} - e^{-\pi i p/2})/2\pi i p =$

$= [1 - (-1)^p]\sin(p\pi/2)/\pi p,$


zero if p is even, and equal to $2(-1)^{(p-1)/2}/\pi p$ if p is odd; since we omitted
a factor $\pi/4$, we finally have


$a_p = 0$ (p even) or $(-1)^{(p-1)/2}/2p$ (p odd),


which agrees with (8). Thus here


$\sum |a_n|^2 = \frac{1}{2}\left(1 + 1/3^2 + 1/5^2 + \dots\right)$


since each term is repeated twice. To apply (7), we again have to calculate
the integral, which is immediate since $|f(x)|^2 = \pi^2/16$ for every x. Hence the
formula


(5.10) $1 + 1/3^2 + 1/5^2 + \dots = \pi^2/8.$


Since one knows that

$\pi^2/6 = \sum 1/n^2 = \sum 1/(2n)^2 + \sum 1/(2n-1)^2 = \pi^2/24 + \sum 1/(2n-1)^2,$


it remains to observe that 1/6 - 1/24 = 1/8 to confirm that the result is
indeed correct, even if the argument is unsupported for the moment; this
indicates that the hypothesis of absolute convergence in (5), (6) or (7), is
probably too restrictive. And this is what the theory of Fourier series will
show.


⁸ One might be tempted to write the series (8) in the form $\sum (-1)^{(n-1)/2} e_n(x)/2n$
where one sums over all odd $n \in \mathbb{Z}$, but this unordered sum is no more convergent
than the harmonic series; only on grouping the symmetric terms do we obtain a
convergent Fourier series.
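
The square wave computation lends itself to a numerical sanity check. The sketch below assumes the sum (5.9), evaluates the integral in (5.5) by a midpoint rule whose grid is aligned with the jumps at x = 1/4 and x = 3/4, and compares a partial sum of (5.10) with $\pi^2/8$; the step counts are arbitrary choices of ours.

```python
import cmath
import math

def square_wave(x):
    """The function (5.9): pi/4 for |x| < 1/4, -pi/4 for 1/4 < x < 3/4, period 1."""
    x = (x + 0.25) % 1.0 - 0.25   # reduce x to the interval [-1/4, 3/4[
    return math.pi / 4 if x < 0.25 else -math.pi / 4

def coefficient(p, steps=4000):
    """Midpoint-rule approximation of a_p over [-1/4, 3/4], see (5.5)."""
    h = 1.0 / steps
    total = 0j
    for k in range(steps):
        x = -0.25 + (k + 0.5) * h
        total += square_wave(x) * cmath.exp(-2j * math.pi * p * x)
    return total * h

def odd_reciprocal_squares(terms):
    """Partial sum of 1 + 1/3^2 + 1/5^2 + ..."""
    return sum(1.0 / (2 * k - 1) ** 2 for k in range(1, terms + 1))

print(coefficient(1), coefficient(2), coefficient(3))  # near 1/2, 0, -1/6
print(odd_reciprocal_squares(100000), math.pi ** 2 / 8)
```

The computed coefficients match $a_p = (-1)^{(p-1)/2}/2p$ for odd p and vanish for even p, and the partial sums creep up towards $\pi^2/8$ from below.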


Now let


(5.11) $f(z) = \sum a_n z^n$


be a power series which converges on a disc |z| < R and therefore normally
on every disc $|z| \le r < R$. Consider the function f on the circle of radius r;
again putting $e^{2\pi i t} = e(t)$ with t real one finds


(5.12) $f[re(t)] = \sum a_n r^n e_n(t),$


an absolutely convergent Fourier series having exponentials only of index
$n \ge 0$. Therefore, by (5),


(5.13) $\int_0^1 f[re(t)]\,\overline{e_n(t)}\,dt = a_n r^n$ if $n \ge 0$, $= 0$ if $n < 0$.


In particular, for n = 0,


(5.14) $\int_0^1 f[re(t)]\,dt = a_0 = f(0),$


which means that the “mean value” of f on the circle |z| = r is equal to its
value at the centre of the circle, a curious property of the analytic functions.
But there is better: since (13) allows us to calculate the coefficients $a_n$ from
the values of f on the circle of radius r it must be possible to calculate f(z),
and not just f(0), in the same way.

Calculating formally for the moment, i.e. interchanging the $\int$ and $\sum$
signs, applying (13) for a given r < R and substituting in (11) for a z such
that |z| < r, we obtain

(5.15) $f(z) = \sum_{n \ge 0} \left(\frac{z}{r}\right)^n \int_0^1 f[re(t)]\,\overline{e_n(t)}\,dt = \int_0^1 f[re(t)] \sum_{n \ge 0} \left[\frac{z}{re(t)}\right]^n dt = \int_0^1 \frac{re(t)}{re(t) - z}\,f[re(t)]\,dt$

since z and r do not depend on the variable of integration t. To justify
this calculation, i.e. Cauchy’s integral formula, which we shall write in
another way below, it suffices to show that we are integrating a normally
convergent series over [0,1], see (4.3). The factor f[re(t)], bounded because
it is a continuous function of t, is not important. The geometric series
$\sum (z/re(t))^n$ must converge, which forces |z| < r. If this is the case, the
formula $|[z/re(t)]^n| = (|z|/r)^n = q^n$ with q = |z|/r < 1 implies the normal
convergence of the series that we are integrating, qed.
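
Cauchy's formula in the form $f(z) = \int_0^1 \frac{re(t)}{re(t) - z}\,f[re(t)]\,dt$ is easy to test numerically. The sketch below is an illustration under our own assumptions: f is the exponential function, r = 1, and the integral over [0, 1] is replaced by an equally spaced Riemann sum, which is very accurate for periodic analytic integrands.

```python
import cmath

def cauchy_value(f, z, r, samples=256):
    """Approximate the integral over [0, 1] of f(r e(t)) * r e(t) / (r e(t) - z) dt,
    where e(t) = exp(2*pi*i*t) and |z| < r."""
    total = 0j
    for k in range(samples):
        zeta = r * cmath.exp(2j * cmath.pi * k / samples)  # point r e(t) on the circle
        total += f(zeta) * zeta / (zeta - z)
    return total / samples

z = 0.3 + 0.2j
print(cauchy_value(cmath.exp, z, 1.0), cmath.exp(z))  # the formula recovers f(z)
print(cauchy_value(cmath.exp, 0.0, 1.0))              # mean value property: f(0) = 1
```

Taking z = 0 reduces the kernel to 1 and gives back the mean value property (5.14); moving z towards the circle |z| = r slows the convergence, in keeping with the ratio q = |z|/r of the geometric series.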

Formula (15) shows that, on the disc |z| < r < R, we can calculate f
from its values on the circumference |z| = r using an explicit formula of the
simplest kind. One normally states it in terms of $re(t) = \zeta$, a function of t
whose differential is

$d\zeta = 2\pi i\,re(t)\,dt;$


then Cauchy’s formula is written, à la Leibniz, in the form


(5.16) $f(z) = \frac{1}{2\pi i} \int_{|\zeta| = r} \frac{f(\zeta)}{\zeta - z}\,d\zeta$


where one integrates along the circumference $|\zeta| = r$ and where |z| < r. This
is, as we shall see later, a “curvilinear integral” (Chap. VIII, n°s 2 and 4).

Conversely, any function f that is continuous for $|z| \le r$ and satisfies (16)
for |z| < r is a power series, i.e. is analytic in |z| < r: compute as in (15), but
in the opposite order. This will later be used to prove that a uniform limit of
analytic functions is analytic (Chap. VII, n° 19, where a more precise result
will be found).


§ 2. Integrability Conditions


6 — The Borel-Lebesgue Theorem 


As we have seen in Chap. II, n° 11, a very simple sufficient condition for the
integrability of a real function f is the existence for every r > 0 of a step
function $\varphi$ such that


(6.1) $|f(x) - \varphi(x)| < r$ for every $x \in I$;


for then $\varphi - r < f < \varphi + r$ and since the integrals of $\varphi - r$ and $\varphi + r$ are
equal to within $2r\,m(I)$ the relation $m_*(f) = m^*(f)$ follows.

The preceding property means that f is the uniform limit of step
functions, so that the integrability of f would also follow from Theorem 4. The
functions possessing this property, i.e. the regulated functions of Chap. II,
n° 11, have (Chap. III, n° 12) both left and right limits at every point of I.
In this n° and the following, we shall show that this property characterises
them, if I is compact.

The idea of the proof is very simple: the whole problem is to show that, for
every r > 0, one can decompose I into a finite number of subintervals $I_k$ such
that the given function f is constant to within r on each $I_k$. This condition
is clearly necessary if the condition (1) is to be satisfied; if, conversely, it is
satisfied, and if one assumes, as one may, that the $I_k$ are pairwise disjoint,
one obtains (1) on taking $\varphi$ to be equal to $f(\xi_k)$ on $I_k$, where $\xi_k$ is a point
chosen arbitrarily in $I_k$.
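
The construction just described can be sketched in a few lines. Everything specific below is an assumption made for illustration: the function f(x) = x^2 on [0, 1], the tolerance r = 0.01, a partition into 250 equal subintervals (on each of which x^2 varies by less than 0.0081), and the choice of the left end-point as the sample point $\xi_k$.

```python
def step_approximation(f, a, b, pieces):
    """Return the step function equal to f(xi_k) on the k-th of `pieces` equal
    subintervals of [a, b], with xi_k taken to be the left end-point."""
    h = (b - a) / pieces

    def phi(x):
        k = min(int((x - a) / h), pieces - 1)  # index of the subinterval containing x
        return f(a + k * h)

    return phi

def f(x):
    return x * x

r = 0.01
phi = step_approximation(f, 0.0, 1.0, pieces=250)

# Largest deviation |f - phi| observed on a fine sample grid of [0, 1].
worst = max(abs(f(j / 10000) - phi(j / 10000)) for j in range(10001))
print(worst, worst < r)
```

The sampled deviation stays below r, which is condition (6.1) for this particular f; for a general regulated function the difficulty, addressed next, is to know that a suitable finite partition exists at all.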

Now, given a function f that has right and left limits at every $x \in I$, it is
very easy to construct such intervals. Since, for every $a \in I$, the limits f(a+)
and f(a-) exist, there is an open interval $]a, a + r'[$ with left end-point a
and an open interval $]a - r'', a[$ with right end-point a on which the function
is constant to within⁹ r. And of course it is constant to within r on the
interval [a, a]. If one then considers the open interval $U(a) = \,]a - r'', a + r'[$
one obtains the following results: (i) each U(a) is the union of at most three
intervals on each of which f is constant to within r; (ii) U(a) contains a for
every $a \in I$. The theorem at which we aim would therefore be established if
we could find a finite number of points $a_k$ such that


$I \subset \bigcup U(a_k)$


since then I would be the union of its intersections with the $U(a_k)$, which
are composed of at most three intervals on which f is constant to within r.


⁹ If a is the right (resp. left) end-point of I, one may take any number > 0 for
$r'$ (resp. $r''$). If the function f is continuous, one can even find an open interval
with centre a on which f is constant to within r, but this detail does not simplify
the following proof.

This kind of question certainly set mathematicians a-thinking from about
1850 onwards, at least those who were concerned about the foundations of
analysis and, in particular, about the properties of continuous functions.
In their research on the “grand” integration theory, Émile Borel and Henri
Lebesgue came to isolate the crucial point which their predecessors (see
Theorem 8 below) had more or less used, without appreciating the generality of
the statement; it was later extended, like the Bolzano-Weierstrass theorem,
to much more general spaces than R or C where the notion of a compact set
has meaning (see for example Dieudonné, Vol. I, Chap. III, n° 16).


Theorem 5 (Borel-Lebesgue). Let K be a compact subset of R (resp. C)
and $(U_i)_{i \in I}$ a family of open sets in R (resp. C). Suppose that K is contained
in the union of the $U_i$. Then there is a finite subset F of the set of indices
I such that K is contained in the union of the $U_i$, $i \in F$. This property
characterises the compact subsets of R (resp. C).


First we show that if K is bounded one can, for every r > 0, find a finite
number of points $x_k$ of K such that K is contained in the union of the open
balls $B(x_k, r)$. Since K is certainly contained in a compact interval or square,
it is clear that one can find a finite number of open balls of radius r/2 which
cover K, i.e. whose union contains K. Let us choose an $x_k \in K$ in each of
those of these balls $B_k$ which actually intersect K. Since $B_k$ is of radius r/2,
so of diameter r, we have $B_k \subset B(x_k, r)$, so that the $B(x_k, r)$ cover K as
desired.

To prove the existence of F, it thus suffices to show that there exists a
number r > 0 possessing the following property:


(*) for every $x \in K$ the open ball B(x, r) is contained in one of
the $U_i$.


If this is so, then it is enough to choose a $U_i$ containing $B(x_k, r)$ for each k
to obtain the first assertion of the theorem.

Suppose (*) is false. For every $n \in \mathbb{N}$ there then must exist an $x(n) \in K$
such that the ball B(x(n), 1/n) is not contained in any of the $U_i$. By BW,
since K is compact, one can extract from the sequence x(n) a subsequence
$x(p_n)$ which converges to an $a \in K$ (Chap. III, n° 9). Since one of the $U_i$
contains a, and is open, it contains a ball B(a, r) of radius r > 0. For n large
one has both $|a - x(p_n)| < r/2$ and $1/p_n < r/2$. It follows that the ball
$B(x(p_n), 1/p_n)$ is contained in B(a, r) and a fortiori in $U_i$, a contradiction.

It remains to show that the compact sets are the only ones to have the
BL property. In the first place, a set K which has it is bounded; indeed, K
is covered by the family of the open balls B(x, 1), $x \in K$, since any ball
contains its centre; there is then a finite number of $x_k \in K$ such that the
$B(x_k, 1)$ cover K, whence this property.

On the other hand, K is closed, i.e. contains every adherent point a. Let
us assume the opposite and let $a \notin K$ be an adherent point of K. For every
$n \in \mathbb{N}$, denote by $U_n$ the set of $x \in \mathbb{R}$ (or C) such that d(x, a) > 1/n, i.e.
the exterior of the ball B(a, 1/n). The $U_n$ are open and cover K: for every
$x \in K$ one has d(x, a) > 0 since $a \notin K$, whence, for n large, d(x, a) > 1/n,
i.e. $x \in U_n$. If then K has the Borel-Lebesgue property one can cover it by
a finite number of sets $U_n$; but since these form an increasing sequence this
means that $K \subset U_n$ for n large, in other words that the closed ball B(a, 1/n)
complementing $U_n$ in R (or C) does not meet K. Contradiction, since a is
adherent to K, qed.

By a curious coincidence, the essential tool in this proof is the
Bolzano-Weierstrass theorem, which, as we know (Chap. III, n° 9), characterises the
compact subsets of R or C. We may therefore wonder whether, conversely, it
is possible to deduce BW from BL, which would allow the reader to add BL
to the list of the statements equivalent to the axiom (IV) of R (Chap. III, end
of n° 10). For a proof, see Dieudonné, Éléments d’analyse, Vol. I, Chap. III,
n° 16.


Corollary 1. Let $(K_i)_{i \in I}$ be a family of nonempty compact sets in R or C.
Suppose that the intersection of the $K_i$ is empty. Then there is a finite subset
F of I such that the intersection of the $K_i$, $i \in F$, is empty.


We choose any index j and replace each $K_i$ by $K_i \cap K_j$. If one of these
intersections is empty, the corollary is proved. So assume they are nonempty.
This is equivalent to assuming that all the $K_i$ are contained in the same
compact set K, namely $K_j$.

Let $U_i$ be the complement of $K_i$ in R (or C). It is open since $K_i$ is closed.
The union of the $U_i$ is the complement of the intersection of the $K_i$. If this
intersection is empty then the $U_i$ cover R (or C) and thus K. By BL, there exists
a finite set $F \subset I$ such that the $U_i$, $i \in F$, cover K. The complement of the
union of these $U_i$ is the intersection of the $K_i$, $i \in F$. This cannot intersect
K and since it is contained in K it must be empty, qed.

If we put

$K_F = \bigcap_{i \in F} K_i$

for every finite subset F of I we can reformulate the preceding corollary
as follows: the $K_i$ have a point in common if and only if $K_F$ is nonempty
irrespective of F. The case where the $K_i$ are intervals in R has already been
treated in Chap. III, n° 9.

The reader will perhaps wonder why it is necessary to assume the $U_i$ open
in the BL theorem. A trivial counterexample: cover K by the closed sets {x},
$x \in K$; if K is infinite it is clearly impossible to cover it by a finite number
of such sets. One might prefer a less crude counterexample. Take K = [-1, 1]
and cover it by the intervals ]1/2, 1], ]1/3, 1/2], ... and [-2, 0]. Every x > 0 in
K belongs to one and only one interval ]1/(n+1), 1/n], and every $x \le 0$ to the
interval [-2, 0]; the obstacle would fall if one had chosen [-2, r] with an r > 0.

Another important consequence of BL is the local character of uniform
convergence on a compact set:


Corollary 2. Let X be a subset of C and $(f_n)$ a sequence of scalar functions
defined on X, converging simply to a limit f. Assume that, for every $a \in X$,
there exists a ball B(a) of centre a such that the $f_n$ converge to f uniformly
on $B(a) \cap X$. Then the $f_n$ converge uniformly on every compact $K \subset X$
(“compact convergence” on X).


We may assume the B(a) open. By BL, one can cover K by a finite
number of balls $B(a_i)$. For r > 0 given, the assertion


(6.2) $|f_n(x) - f(x)| < r$ for every $x \in B(a_i) \cap K$


is, for each i, true for n large. Since, for r given, these relations are finite
in number, they are thus simultaneously true for n large (Chap. II, n° 3),
and since the union of the $B(a_i) \cap K$ is K, it follows that, for n large, the
inequality (2) is true for all the $x \in K$ simultaneously, qed.

Corollary 2 is particularly useful in the theory of analytic functions; X is
then an open subset of C and it is often easy to show that, for every $a \in X$,
the convergence of the $f_n$ is uniform on a sufficiently small disc with centre
a, whence compact convergence on X.


7 — Integrability of regulated or continuous functions 


The arguments which led us to formulate the BL theorem at the beginning 
of the preceding n° lead to the following result: 


Theorem 6. Let f be a scalar function defined on an interval I of R. The
two following properties are equivalent: (i) f has left and right limits at every
point of I; (ii) there is a sequence of step functions on I which converges to
f uniformly on every compact subset of I. The function f is then continuous
on the complement of a countable subset of I.


The implication (ii) $\Longrightarrow$ (i) was established from Cauchy’s criterion in
Chap. III, n° 12 (Corollary of Theorem 16). The implication (i) $\Longrightarrow$ (ii)
is obtained, when I is compact, by observing, as at the beginning of the
preceding n°, that for every r > 0, there exists for every $x \in I$ an open
interval $U(x) = \,]x - r'', x + r'[$ such that f is constant to within r on each of
the three intervals $]x - r'', x[$, $[x, x]$ and $]x, x + r'[$; it remains only to apply
BL to the U(x) to obtain a finite number of intervals covering I and on each
of which f is constant to within r; this argument also shows that f is bounded
on every compact $K \subset I$.

In the case of a not necessarily compact interval I one clearly has to work
on an arbitrary compact interval K contained in I. One sees then that the
following two properties are equivalent for a scalar function f defined on I:

(i) f has right and left limits at every point of I, in other words, by definition,
is regulated on I;

(ii) for every compact interval $K \subset I$ there exists a sequence of step functions
on K which converges to f uniformly on K.


One can then find a sequence $(\varphi_n)$ of step functions on I (i.e. such that one
can partition I into a finite number of intervals on each of which the function
is constant) which, for every compact $K \subset I$, converges to f uniformly on K:
choose an increasing sequence of compact intervals $K_n$ with union I and, for
each n, a step function $\varphi_n$ on $K_n$ satisfying $|f(x) - \varphi_n(x)| < 1/n$ for every
$x \in K_n$, and then define $\varphi_n$ on all of I by agreeing that $\varphi_n(x) = 0$ for every
$x \in I - K_n$. Again $\lim \varphi_n(x) = f(x)$ for every $x \in I$ because $x \in K_p$ for p
large, whence $|f(x) - \varphi_n(x)| < 1/n$ for every $x \in K_p$ and every $n \ge p$ since
then $K_p \subset K_n$.

It remains to prove the continuity of f. For every n, let $D_n$ be the finite
set of the points of I where $\varphi_n$ is discontinuous. The union D of the $D_n$ is
countable (Chap. I) and the $\varphi_n$, as functions defined on I, are all continuous
at every $x \in I - D$. Similarly¹⁰ for f, qed.

Note that the theorem applies to monotone functions in particular. 


Corollary. Every bounded and regulated function f on a bounded interval
I = (a, b) is integrable on I, and


(7.1) $\int_a^b f(x)\,dx = \lim_{u \to a,\ v \to b} \int_u^v f(x)\,dx.$

Choose an r > 0 and a compact interval K = [u, v] contained in I and such
that $m(I) - m(K) < r$. By Theorem 6 the function f is integrable on K. There
must therefore exist on K a step function $\varphi$ such that $m_K(|f - \varphi|) < r$, where
$m_K$ is the integral over K. We define a step function $\varphi'$ on I by requiring it
to be equal to $\varphi$ on K and zero off K. Since $|f(x) - \varphi'(x)| \le \|f\|_I$ off K, we
see, separating the contributions over K and I - K, that


$\int_I |f(x) - \varphi'(x)|\,dx \le r + [m(I) - m(K)]\,\|f\|_I \le r(1 + \|f\|_I)$


whence f is integrable on I. Relation (1) follows on remarking that the
difference between the integrals over I and [u, v] is the sum of the integrals over
]a, u] and [v, b[, intervals whose lengths tend to 0, qed.

We will rediscover this in § 7 à propos the integration of not necessarily
bounded functions over arbitrary intervals. Up to then integrals over a
compact interval will almost always be sufficient for our needs, but it is good
to know that in spite of its unorthodox behaviour on a neighbourhood of 0,
the function sin(1/x) is integrable over ]0,1] in the sense of n° 2, the most
elementary that there is.


¹⁰ To avoid all confusion, recall that we are dealing with the continuity of f as a
function on I and not only on I - D. See n° 5 of Chap. III again.

The arguments showing that (i) $\Longrightarrow$ (ii) also serve to establish the
following result, already mentioned in n° 3:


Theorem 7. The integral of a regulated (resp. continuous) positive function
f is zero if and only if the set $D = \{f(x) \ne 0\}$ is countable (resp. empty).


The condition is sufficient. For consider a step function $\varphi \le f$. One can
have $\varphi(x) > 0$ only if $x \in D$. Since the set of points of a nonsingleton
interval is uncountable (Chap. I), the function $\varphi$ is necessarily negative on
all the intervals of nonzero length where it is constant. Thus $m(\varphi) \le 0$ and,
since m(f) is the upper bound of these $m(\varphi)$, we also have $m(f) \le 0$, whence
m(f) = 0 since f is positive.

To show that it is necessary, we assume first that I is compact and
construct a sequence $(\varphi_n)$ of step functions on I such that
$\|f - \varphi_n\|_I \le 1/n$. Replacing $\varphi_n$ by $\varphi_n - 1/n$, we may assume that
$\|f - \varphi_n\|_I \le (1 + m(I))/n$ and $\varphi_n \le f$. Since $f \ge 0$ one can even
assume $\varphi_n \ge 0$ (replace them by the $\varphi_n^+$). Then $m(\varphi_n) = 0$ since
m(f) = 0. Each $\varphi_n$ is then zero outside a finite set $D_n$. The union D of the
$D_n$ is countable (Chap. I) and since $f(x) = \lim \varphi_n(x)$ for every $x \in I$ it is
clear that f(x) = 0 for every $x \notin D$.

If now I is not compact it is the union of a sequence of compact intervals $K_n$.
The integral of f over each $K_n$ is clearly zero; the $D \cap K_n$ are therefore
countable, so $D = \bigcup D \cap K_n$ (Chap. I) is too.

If f is continuous then D is open and so, if not empty, must contain an 
interval of length > 0, which would have to be countable like D, contrary to 
Cantor’s most famous theorem, qed. 

A corollary of Theorem 7 is that if two regulated functions f and g are
equal outside a countable set D then m(f) = m(g). For the function $|f - g|$ is
again regulated¹¹ and it is positive; Theorem 7 then shows that $m(|f - g|) = 0$,
whence m(f) = m(g).

One might be tempted to believe that conversely, if one modifies the values 
of a regulated function f on a countable set D of points, one will again find 
an integrable or even regulated function. False: the constant function equal 
to 1 is as regulated as it is possible to be, but if you change it to have the 
value 0 at rational points you will obtain the Dirichlet function which is 
neither regulated nor Riemann integrable. 


8 — Uniform continuity and its consequences 


The principal interest of Theorem 6 is to show that every regulated function
is integrable. In particular this is the case for continuous functions. The proof
of the implication (i) $\Rightarrow$ (ii) of Theorem 6 allows us to isolate an important
property that they have, namely uniform continuity.


11 Obvious. Note, in this circle of ideas, that if f is regulated and if g is continuous
then the composite function $g \circ f$ is again regulated, since if x tends to c+ or
c−, then f(x) tends to f(c+) or f(c−), so that g[f(x)] tends to a limit, namely
g[f(c+)] or g[f(c−)], qed. This result may not follow if g is only regulated.

Consider, generally, a scalar function f defined and continuous on a subset
X of R or C. For every r > 0 and every $x \in X$ there exists a number r' > 0
such that, for $y \in X$,


(8.1)    $d(x, y) \le r' \Longrightarrow d[f(x), f(y)] \le r.$


The number r' depends a priori on the choice of r and of x. One says that
f is uniformly continuous on X if, for every r > 0, you can choose the same
r' > 0 for all $x \in X$, so that

(8.2)    $\{(x \in X)\ \&\ (y \in X)\ \&\ (d(x, y) \le r')\} \Longrightarrow d[f(x), f(y)] \le r.$

Suppose for example that X = R and let us put y = x − h. Then (2) means
that


(8.3)    $|h| \le r' \Longrightarrow |f(x - h) - f(x)| \le r$ for every $x \in \mathbf{R}$.


Now let us introduce the translated functions

(8.4)    $f_h(x) = f(x - h)$


of f whose graphs are derived from the graph of f by horizontal translations.
This said, the fact that $d[f_h(x), f(x)] \le r$ for every x means simply, in the
notation of Chap. III, n° 7, that


(8.5)    $d_{\mathbf{R}}(f, f_h) = \|f - f_h\|_{\mathbf{R}} \le r.$


The existence, for every r > 0, of an r' > 0 satisfying (3) thus means that as
h tends to 0 the function $f_h(x)$ converges to f(x) uniformly on R. One would
like to formulate uniform continuity on an arbitrary set X in a similar way,
but in this case the function $f_h(x)$ is defined only on the set $X + h$ formed from
X by the horizontal translation of amplitude h, and convergence, uniform or
not, no longer has a meaning.

Uniform continuity is very far from being a universal property of continuous
functions. If you take the function $f(x) = e^x$ on R for example, whence
$f_h(x) = e^{-h} f(x)$, it is clear that, as h tends to 0, $f_h$ converges simply to f
(this is continuity), but for a given h the difference
$|f(x) - f_h(x)| = |e^{-h} - 1|\,e^x$ is not even bounded on R, which rules out
uniform convergence: in this case $\|f - f_h\|_{\mathbf{R}} = +\infty$ for any $h \neq 0$.
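
The contrast can be made concrete with a small Python sketch. The function name `translation_gap` and the sampling grid are our own assumptions; a finite sample can only suggest the behaviour of the sup norm, it does not compute it. For the exponential the sampled gap grows without limit as the window [a, b] widens, whereas for the sine it stays below |h| on every window.

```python
import math

def translation_gap(f, h, a, b, samples=2001):
    """Largest sampled value of |f(x - h) - f(x)| for x in [a, b]."""
    step = (b - a) / (samples - 1)
    return max(
        abs(f(a + k * step - h) - f(a + k * step)) for k in range(samples)
    )

if __name__ == "__main__":
    h = 0.01
    for b in (1.0, 5.0, 10.0, 20.0):
        gap_exp = translation_gap(math.exp, h, 0.0, b)
        gap_sin = translation_gap(math.sin, h, 0.0, b)
        print(f"window [0, {b:>4}]: exp gap = {gap_exp:.4g}, sin gap = {gap_sin:.4g}")
```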

We always have: 


Theorem 8 (Heine¹²). Every scalar function defined and continuous on a
compact set $K \subset \mathbf{C}$ is uniformly continuous on K.


Given r > 0 let us choose for each $x \in K$ an open ball B(x) with centre x
such that f is constant to within r in $B(x) \cap K$. Let B'(x) be the open ball
with centre x and of radius half that of B(x). Since the B'(x) cover K, one
can, by BL, find points $x_1, \dots, x_n$ of K such that the balls $B'(x_i)$ cover K.
Let r' > 0 be the smallest of their radii, and let x, y be two points of K such
that d(x, y) < r'. The point x belongs to one of the balls $B'(x_i)$. Since the
radius of $B(x_i)$ is twice that of $B'(x_i)$, itself $\ge r'$, the ball $B(x_i)$ contains y
too, by the triangle inequality. Since f is constant to within r on $B(x_i)$ we
have $|f(x) - f(y)| \le r$, qed.


Corollary 1. Let f be a scalar function defined and continuous on R (resp.
C) and zero for |x| large. Then f is uniformly continuous on R (resp. C).


We need only treat the case of C. Let K be a compact set outside which
f = 0, and H the set of $x \in \mathbf{C}$ such that $d(x, K) \le 1$. Since d(x, K) is a
continuous function of x (Chap. III, n° 10), the set H is closed. It is clearly
bounded like K, so is compact. For every r > 0 there is thus an r' > 0 such
that, for $x, y \in H$, the relation d(x, y) < r' implies $d[f(x), f(y)] < r$. We
may assume r' < 1. Now let x, y be two points of C such that d(x, y) < r'.
If both are in H, the question is settled. If $x \notin H$ we have d(x, K) > d(x, y),
so $y \notin K$, whence f(x) = f(y) = 0, qed.


It is easy to understand why Theorem 8 does not apply to noncompact
sets. Consider such a set X and a uniformly continuous function f on it; and
let a be an adherent point of X; then f tends to a limit when $x \in X$ tends
to a. For, take an r > 0; in view of Cauchy's criterion (Chap. III, n° 10,
Theorem 13'), we need to prove the existence of an r' > 0 such that, for
$x, y \in X$,


    $\{(|x - a| < r')\ \&\ (|y - a| < r')\} \Longrightarrow |f(x) - f(y)| < r.$


But since f is uniformly continuous there is an r'' > 0 such that the right
hand inequality holds for $|x - y| < r''$; it then suffices to take r' = r''/2.


12 Heine published in 1872, but Dugac tells us that Weierstrass had already taught
the theorem in 1865, that Riemann and Dirichlet had actually used it without
proof by 1854, and that it had been used implicitly by Cauchy, who had not
perceived the difficulty (Chap. III, n° 6).

In these circumstances it is natural to define a function F on the closure¹³
$\bar{X}$ of X by putting

    $F(a) = \lim_{x \to a,\ x \in X} f(x)$

for every $a \in \bar{X}$; we have F(a) = f(a) if $a \in X$. Let us show that the function
F is continuous on $\bar{X}$. For every r > 0 we choose an r' > 0 such that, for
$x, y \in X$,

    $|x - y| < r' \Longrightarrow |f(x) - f(y)| < r$


and consider two points a, b of $\bar{X}$ such that |a − b| < r' (strict inequality). If
$x, y \in X$ are sufficiently close to a and b respectively, we again have $|x - y| < r'$
and so $|f(x) - f(y)| < r$; since f(x) and f(y) tend to F(a) and F(b), we find
in the limit that $|F(a) - F(b)| \le r$, whence the result.

This shows that the notion of uniform continuity in reality concerns only
continuous functions on a closed set, or, equivalently, which can be extended
to a closed set while remaining continuous (and even uniformly continuous).
In particular:


Corollary 2. Let f be a function defined and continuous on a bounded set
$X \subset \mathbf{C}$. The following two properties are equivalent: (i) f is uniformly
continuous on X; (ii) f is the restriction to X of a continuous function on the
compact set $\bar{X}$.


We have just seen that (i) implies (ii). The converse implication follows
from Theorem 8 since $\bar{X}$ is compact.

If, for example, X = ]0, 1], the function f(x) = sin(1/x) manifestly has no
limit when x tends to 0; this does not prevent it from being integrable since
it is continuous and bounded (Corollary to Theorem 6), but does prevent
it from being uniformly continuous on X. To verify this by a subtle use of
inequalities is a gymnastic exercise in the Weierstrass tradition; Corollary 2
makes this quite unnecessary: there are enough serious occasions for dealing
with inequalities that one prefers not to when one can obtain the result free.
One might, otherwise, advise the amateurs to examine such functions as


    sin(sin(1/x)), sin(exp(sin(1/x))), etc.


“by hand”.
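
For readers who nevertheless want to see the failure of uniform continuity of sin(1/x) on ]0, 1] in numbers, here is a hedged Python sketch. The helper `witness_pair` is our own construction, not taken from the text: it returns two points whose distance tends to 0 as k grows while the values of the function stay about 2 apart, so that no single r' can serve for r = 1.

```python
import math

def f(x):
    return math.sin(1.0 / x)

def witness_pair(k):
    """Two points of ]0, 1] where sin(1/x) is close to +1 and -1 respectively."""
    x = 1.0 / (2 * math.pi * k + math.pi / 2)      # 1/x = 2*pi*k + pi/2
    y = 1.0 / (2 * math.pi * k + 3 * math.pi / 2)  # 1/y = 2*pi*k + 3*pi/2
    return x, y

if __name__ == "__main__":
    for k in (1, 10, 100, 1000):
        x, y = witness_pair(k)
        print(f"k = {k:<5} |x - y| = {abs(x - y):.3e}   |f(x) - f(y)| = {abs(f(x) - f(y)):.6f}")
```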


Corollary 2 allows us to answer an approximation problem: can one
approximate a given continuous function on X by polynomials uniformly on
X? We shall show in n° 28 that this is so if X is a compact interval in R (or
C so long as one uses polynomials in x and y, and not in z = x + iy). But if
X is bounded without being compact?

Let p be a polynomial satisfying $|f(x) - p(x)| < r$ for every $x \in X$. Since
the function p is continuous on R and so on the compact closure of X, it
is uniformly continuous on X. There is therefore an r' > 0 such that, for
$x, y \in X$, the relation $|x - y| < r'$ implies $|p(x) - p(y)| < r$ and consequently
$|f(x) - f(y)| < 3r$. In other words, if f is the uniform limit of polynomials
(or, more generally, of uniformly continuous functions on X), then f is
uniformly continuous on X. Conversely, f may then be extended to a
continuous function on the compact set $\bar{X}$ and Weierstrass' theorem provides
the desired approximation on $\bar{X}$, so a fortiori on X. The question thus lacks
interest: when the answer is affirmative it results from Weierstrass' theorem
for a compact set. On the other hand we have shown in Chap. III, n° 5 that
if X is an unbounded interval in R then the only uniform limits of polynomials
in X are the polynomials themselves. Moral: do not try to “improve”
Weierstrass' theorem ...


13 Recall that this is the set of points that one can approximate by the $x \in X$, or,
again, the smallest closed set containing X.


Another consequence of Heine's theorem is the possibility of defining the
integral of a continuous function f over a compact interval I by means of the
standard Riemann sums.

One can, for example, like Cauchy, consider arbitrary subdivisions of I
and the sums $\sum f(x_k)(x_{k+1} - x_k)$ which irresistibly evoke Leibniz' notation
$\int f(x)\,dx$ (the evocation, as concerns Cauchy, would rather go in the inverse
sense ...) or even the more general sums $\sum f(\xi_k)(x_{k+1} - x_k)$ with the points
$\xi_k$ chosen arbitrarily in the closed intervals¹⁴ $[x_k, x_{k+1}]$. If, on each of these
intervals, the function is constant to within r, the function f is everywhere
equal to within r to the step function equal to $f(\xi_k)$ on $[x_k, x_{k+1}]$, so that the
integral of f is equal to the sum considered to within m(I)r. But since f is
uniformly continuous this condition will be satisfied so long as $|x_{k+1} - x_k| < r'$
for a suitably chosen r' > 0. In other words:


Corollary 3. Let f be a scalar function defined and continuous on a compact
interval I. For every r > 0 there exists an r' > 0 such that


(8.6)    $\left| \int_I f(x)\,dx - \sum f(\xi_k)(x_{k+1} - x_k) \right| \le r$


for any points $\xi_k \in [x_k, x_{k+1}]$ so long as the subdivision $(x_k)$ of I satisfies
$|x_{k+1} - x_k| < r'$ for every k.


For example one can decompose I into n equal intervals $I_1, \dots, I_n$ and
choose a $\xi_k \in I_k$ at random for each k. The corresponding Riemann sum is
just

    $\frac{m(I)}{n}\,[f(\xi_1) + \dots + f(\xi_n)].$

It tends to the integral of f as n increases indefinitely. This remark explains
why the ratio m(f)/m(I) between the integral of f and the measure of I is
called the mean value of the function f on I.
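
The convergence of these sums is easy to observe. The Python sketch below is an illustration only: `riemann_sum` and `square` are names of our choosing, and the tags are drawn with a seeded pseudo-random generator. For f(x) = x² on I = [0, 1] the function varies by at most 2/n on each of the n subintervals, so every such sum lies within 2/n of the integral 1/3, which is also the mean value of f on I.

```python
import random

def riemann_sum(f, a, b, n, rng):
    """Riemann sum over n equal subintervals of [a, b] with random tags."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        left = a + k * h
        xi = left + rng.random() * h   # arbitrary tag in the k-th subinterval
        total += f(xi) * h
    return total

def square(x):
    return x * x

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 100, 1000, 10000):
        s = riemann_sum(square, 0.0, 1.0, n, rng)
        print(f"n = {n:<6} sum = {s:.6f}   error = {abs(s - 1.0 / 3.0):.2e}")
```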


14 For a general regulated function we have seen above that the $\xi_k$ must be interior
to the intervals of the subdivision because the function f may be discontinuous
at the points $x_k$.


9 — Differentiation and integration under the ∫ sign


We shall continue to explain various important consequences of uniform conti- 
nuity. We have established them not only in R, but also in C, i.e. for functions 
of two real variables. 

Consider a function f(x, y) defined and continuous on a rectangle $K \times J$
in C, where K and J are intervals in R, and K is assumed compact as its
name suggests. We can integrate f(x, y) with respect to x for given y, and
more generally consider the function


(9.1)    $\varphi(y) = \int_K f(x, y)\,\mu(x)\,dx$


where $\mu$ is an arbitrary integrable function on K (if not a Radon measure ...).


Theorem 9. Let K be a compact interval, J an arbitrary interval of R, and
let f(x, y) be a continuous function on $K \times J$. Then


(i) the function (1) is continuous in J;
(ii) if f has a continuous partial derivative $D_2 f(x, y)$ on $K \times J$ then $\varphi$ is of
class $C^1$ on J and


(9.2)    $\varphi'(y) = \int_K D_2 f(x, y)\,\mu(x)\,dx.$


Continuity and differentiability at a point y being local properties we can 
replace J in what follows by a compact interval H C J containing all points 
of J sufficiently close to y. 

Generally, put $\mu(f) = \int_K f(x)\mu(x)\,dx$ for every function f continuous on K,
whence, omitting the K under the ∫ sign,


    $|\mu(f)| \le \int |f(x)|\,|\mu(x)|\,dx \le M(\mu)\,\|f\|_K$


where $M(\mu) = \int |\mu(x)|\,dx$. Then $\varphi(y) = \mu(f_y)$ where $f_y(x) = f(x, y)$.
Since f is continuous and so uniformly continuous on the compact set
$K \times H$, one can associate with every r > 0 an r' > 0 such that, on $K \times H$,

(9.3)    $(|x' - x''| < r')\ \&\ (|y' - y''| < r') \Longrightarrow |f(x', y') - f(x'', y'')| < r.$


For $|y' - y''| < r'$ one then has $|f(x, y') - f(x, y'')| < r$, i.e.
$|f_{y'}(x) - f_{y''}(x)| < r$, for every $x \in K$; consequently,


(9.4)    $|y' - y''| < r' \Longrightarrow \|f_{y'} - f_{y''}\|_K \le r.$


This means that, as y'' tends to y', the function $f_{y''}$ converges to $f_{y'}$ uniformly
on K. The continuity of $\varphi$ follows from this since

    $|\varphi(y') - \varphi(y'')| = |\mu(f_{y'}) - \mu(f_{y''})| \le M(\mu)\,\|f_{y'} - f_{y''}\|_K \le M(\mu)\,r$


for $|y' - y''| < r'$.

As to differentiability, we put $D_2 f = g$, $g_y(x) = g(x, y)$ and denote the
right hand side of (2) by $\psi(y) = \mu(g_y)$, a continuous function of y by (i)
applied to g. Then


(9.5)    $\frac{\varphi(y+h) - \varphi(y)}{h} - \psi(y) = \frac{\mu(f_{y+h}) - \mu(f_y)}{h} - \mu(g_y) = \mu\left[ \frac{f_{y+h} - f_y}{h} - g_y \right]$


by the linearity of $f \mapsto \mu(f)$. To show that the left hand side tends to 0 with
h, it suffices to show that, as h tends to 0, the function of x to be integrated
(in the third term) tends to 0 uniformly on K for y given.

Now we proved in Chap. III, n° 16 (Corollary 4 of the Mean Value Theorem)
that for every differentiable function p on a compact interval [a, b], we
have


    $|p(b) - p(a) - p'(c)(b - a)| \le |b - a| \cdot \sup |p'(x) - p'(c)|$


for every $c \in [a, b]$, the sup being taken over the $x \in [a, b]$. We apply this
result to the function $y \mapsto f(x, y)$ for x given; we obtain


    $|f(x, y+h) - f(x, y) - D_2 f(x, y)h| \le |h| \cdot \sup |D_2 f(x, y+k) - D_2 f(x, y)|,$


the sup being taken over the k lying between 0 and h. The function $D_2 f$
being continuous and so uniformly continuous on the compact set $K \times H$,
there exists for every r > 0 an r' > 0 such that


    $|k| \le r' \Longrightarrow |D_2 f(x, y+k) - D_2 f(x, y)| \le r$

for any $x \in K$ and $y \in H$. We deduce that

    $|h| \le r' \Longrightarrow |f(x, y+h) - f(x, y) - D_2 f(x, y)h| \le r|h|,$


i.e. that


    $|h| \le r' \Longrightarrow |f_{y+h}(x) - f_y(x) - h g_y(x)| \le r|h|,$


for any $x \in K$. On taking the sup for $x \in K$ and dividing by |h|, we deduce
that


(9.6)    $|h| \le r' \Longrightarrow \|(f_{y+h} - f_y)/h - g_y\|_K \le r$


which proves uniform convergence as announced, or, if one prefers, shows
that the left hand side of (5) is $\le M(\mu)\,r$, qed.
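
Formula (9.2) can be checked on an example. The following Python sketch makes several assumptions of our own: K = [0, 1], μ = 1, f(x, y) = cos(xy), and a midpoint rule in place of the integral. It compares the integral of $D_2 f$ with a central difference quotient of φ; both should be close to the derivative of the closed form φ(y) = sin(y)/y.

```python
import math

def midpoint_integral(g, a, b, n=2000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def phi(y):
    """phi(y) = integral over K = [0, 1] of f(x, y) = cos(x y), with mu = 1."""
    return midpoint_integral(lambda x: math.cos(x * y), 0.0, 1.0)

def phi_prime_under_integral(y):
    """Integral over K of D_2 f(x, y) = -x sin(x y), as in formula (9.2)."""
    return midpoint_integral(lambda x: -x * math.sin(x * y), 0.0, 1.0)

def phi_prime_difference_quotient(y, h=1e-5):
    """Central difference quotient of phi at y."""
    return (phi(y + h) - phi(y - h)) / (2 * h)

if __name__ == "__main__":
    y = 1.3
    exact = (y * math.cos(y) - math.sin(y)) / y**2   # derivative of sin(y)/y
    print("under the integral sign:", phi_prime_under_integral(y))
    print("difference quotient    :", phi_prime_difference_quotient(y))
    print("closed form            :", exact)
```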


Let now K and H be two compact intervals, $\mu$ and $\nu$ two integrable
functions on K and H and f a continuous function in $K \times H$. We can then
consider the iterated integral which we denote by

    $\int_H \nu(y)\,dy \int_K f(x, y)\mu(x)\,dx$  rather than  $\int_H \left( \int_K f(x, y)\mu(x)\,dx \right) \nu(y)\,dy$


as, in principle, one should. One can also perform these operations in the
opposite order.


Theorem 10. Let K and H be two compact intervals in R and f a continuous
function on $K \times H$. Then


(9.7)    $\int_H \nu(y)\,dy \int_K f(x, y)\mu(x)\,dx = \int_K \mu(x)\,dx \int_H f(x, y)\nu(y)\,dy$


for any integrable functions $\mu$ and $\nu$ on K and H.


This is the analogue of the theorem on absolutely convergent double series 
(Chap. II, n° 18). 

To prove the equality of the two sides of (7), note that, by (3), there exist
finite partitions of K and H into intervals $K_p$ and $H_q$ such that f is constant
to within r on each rectangle $K_p \times H_q$. Then


    $\int_H f(x, y)\nu(y)\,dy = \sum_q \int_{H_q} f(x, y)\nu(y)\,dy$


and therefore


(9.8)    $\int_K \mu(x)\,dx \int_H f(x, y)\nu(y)\,dy = \sum_{p,q} \int_{K_p} \mu(x)\,dx \int_{H_q} f(x, y)\nu(y)\,dy.$


Now let us choose points $\xi_p \in K_p$ and $\eta_q \in H_q$. If we replace f(x, y) by
$f(\xi_p, \eta_q)$ in the general term, the error is clearly bounded by


(9.9)    $r \int_{K_p} |\mu(x)|\,dx \int_{H_q} |\nu(y)|\,dy.$


The left hand side of (8) is thus equal to the “double Riemann sum”


(9.10)    $\sum_{p,q} f(\xi_p, \eta_q) \int_{K_p} \mu(x)\,dx \int_{H_q} \nu(y)\,dy = \sum_{p,q} f(\xi_p, \eta_q)\,\mu(K_p)\,\nu(H_q)$


(obvious notation!), with an error less than the sum of the expressions (9),
so less than

    $r \int_K |\mu(x)|\,dx \int_H |\nu(y)|\,dy = M(\mu)M(\nu)\,r,$


i.e. r times the product of the integrals of $|\mu|$ and $|\nu|$ over K and H. One would
find the same result on calculating in the same way from the other side of (7).
Since r > 0 is arbitrary, they must be equal, qed.
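
The double Riemann sum (9.10) lends itself to a direct numerical illustration. In the Python sketch below, everything beyond the formula itself is an assumption of ours: the name `double_riemann_sum`, the choice of left endpoints as tags, and the approximation of $\mu(K_p)$ and $\nu(H_q)$ by a midpoint value times a length (exact when μ and ν are constant). With f(x, y) = cos(x + y) and μ = ν = 1 on [0, 1] × [0, 1], the function varies by at most 2/n on each cell, so the sum lies within 2/n of the double integral 2 cos 1 − cos 2 − 1.

```python
import math

def double_riemann_sum(f, mu, nu, K, H, n):
    """Sum of f(xi_p, eta_q) mu(K_p) nu(H_q) over an n-by-n grid of K x H."""
    (a, b), (c, d) = K, H
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for p in range(n):
        xi = a + p * hx                        # tag: left endpoint of K_p
        mu_Kp = mu(a + (p + 0.5) * hx) * hx    # approximates the integral of mu over K_p
        for q in range(n):
            eta = c + q * hy                   # tag: left endpoint of H_q
            nu_Hq = nu(c + (q + 0.5) * hy) * hy
            total += f(xi, eta) * mu_Kp * nu_Hq
    return total

if __name__ == "__main__":
    exact = 2 * math.cos(1.0) - math.cos(2.0) - 1.0
    for n in (10, 50, 200):
        s = double_riemann_sum(
            lambda x, y: math.cos(x + y), lambda x: 1.0, lambda y: 1.0,
            (0.0, 1.0), (0.0, 1.0), n,
        )
        print(f"n = {n:<4} sum = {s:.6f}   error = {abs(s - exact):.2e}")
```

Since the sum is symmetric in the roles of (K, μ) and (H, ν), the same number approximates both iterated integrals in (9.7), which is the content of the proof.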

The preceding theorem justifies the definition

(9.11)    $\iint_{K \times H} f(x, y)\mu(x)\nu(y)\,dx\,dy = \int_K \mu(x)\,dx \int_H f(x, y)\nu(y)\,dy = \int_H \nu(y)\,dy \int_K f(x, y)\mu(x)\,dx$


of the double integrals taken over a compact rectangle $K \times H$ with respect
to the “product measure” $\mu(x)\nu(y)\,dx\,dy$. One might define them in a more
general framework on replacing $K \times H$ by a not too barbarous bounded
subset of C, or, equivalently, extend the theorem to discontinuous functions,
but one runs quickly into great difficulties if one remains in the framework
of the Riemann integral.

Consider for example the following very trivial problem: one takes a positive
continuous function $\varphi(x)$ on K with values in H and seeks to calculate
the area A contained between the x axis and the curve $y = \varphi(x)$ by means of
a double integral rather than by the usual simple integral. Writing $E \subset \mathbf{R}^2$
for the set of (x, y) such that $x \in K$ and $0 \le y \le \varphi(x)$ and $\chi_E$ for its
characteristic function, equal to 1 on E and to 0 elsewhere, it is “geometrically
obvious” that


(9.12)    $A = \iint_{K \times H} \chi_E(x, y)\,dx\,dy;$


[Fig. 4, with the sets F(b) marked]

moreover, if one calculates the double integral by $\int dx \int dy$, the integral with
respect to y, for x given, involves the function equal to 1 between 0 and $\varphi(x)$
and zero elsewhere, whence $\int dy = \varphi(x)$; on integrating with respect to x
one thus finds the integral of the function $\varphi$, which is precisely the area in
question. But let us first integrate with respect to x. For $y = b \in H$ given,
one has $\chi_E(x, b) = 1$ if $\varphi(x) \ge b$ and = 0 if not; if one then considers the set
$F(b) \subset K$ of $x \in K$ such that $\varphi(x) \ge b$, one has to integrate with respect to
x the characteristic function of F(b), a compact set since $\varphi$ is continuous and
K compact. Now there is no reason why this function should be Riemann
integrable. In fact, for every compact set $F \subset K$, there exists a function $\varphi$
such that F = F(1); it suffices for this that $\varphi(x) = 1$ on F and $\varphi(x) < 1$ for
$x \notin F$. To prove the existence of $\varphi$, consider the function


    $d(x, F) = \inf_{u \in F} |x - u|;$


it is continuous, zero on F and strictly positive outside F (Chap. III, n° 10,
Example 1). Now let M be the maximum of d(x, F) for $x \in K$; on K the
function

    $\varphi(x) = 1 - d(x, F)/M$

is continuous, positive, equal to 1 on F and < 1 elsewhere. For this choice of
$\varphi$ the set $\{\varphi \ge 1\}$ is just F. Figure 5 gives no idea of the complexity of $\varphi$ in
the general case.
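
In the harmless case where F is a finite union of compact intervals, the construction can be written out in a few lines. The Python sketch below is an illustration under our own assumptions: the names `distance_to` and `make_phi` are hypothetical, F is given as a list of pairs (lo, hi), and the maximum M is only estimated on a finite grid of K (which happens to be exact in the usage example, where the maximum is attained at an endpoint of K).

```python
def distance_to(F, x):
    """d(x, F) for F a finite union of compact intervals [lo, hi]."""
    return min(max(lo - x, 0.0, x - hi) for lo, hi in F)

def make_phi(F, K, samples=10_001):
    """Return phi(x) = 1 - d(x, F)/M on K = [a, b], with M estimated on a grid."""
    a, b = K
    grid = [a + k * (b - a) / (samples - 1) for k in range(samples)]
    M = max(distance_to(F, x) for x in grid)   # sampled stand-in for max d(x, F)
    if M == 0.0:
        return lambda x: 1.0                   # F fills K, as far as the grid can tell
    return lambda x: 1.0 - distance_to(F, x) / M

if __name__ == "__main__":
    F = [(0.2, 0.3), (0.6, 0.7)]
    phi = make_phi(F, (0.0, 1.0))
    for x in (0.0, 0.25, 0.45, 0.65, 1.0):
        print(f"phi({x}) = {phi(x):.4f}")
```

For such an F the set {φ ≥ 1} is a finite union of intervals and its characteristic function is a step function; the difficulty described in the text arises only for more complicated compact sets.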

So we see that for it to be possible to invert the order of integration in a
double integral in the Riemann theory so that


    $\iint_E f(x, y)\,dx\,dy = \iint_{K \times H} \chi_E(x, y) f(x, y)\,dx\,dy$


for every “reasonable”, for example compact or open, subset E of the compact
rectangle $K \times H$, as “users” unquestioningly believe, it is necessary, for
a start, that the characteristic function of every compact or open subset of R
should be integrable in the sense of this chapter. If such had been the case,
no one would ever have invented the Lebesgue integral, and certainly not he
himself, since this is precisely the problem which led him to his theory.

Of course, the objection does not arise for the “usual” functions: you can
calculate the area of a semicircle centred on the x axis by integrating first
with respect to x then with respect to y, since in this case the sets F(b) are
inoffensive intervals; for curves a little less convex or concave the F(b) can
be finite unions of closed intervals, which poses no greater problem, though
again one has to justify it. But the general case lies beyond the elementary
theory; we shall treat it in n° 33.

Finally we remark that all the results of this n° remain valid, with the
same proofs (n° 30), when one integrates with respect to a general “Radon
measure” i.e. when one replaces the integral $f \mapsto m(f)$ by a function
$f \mapsto \mu(f)$ satisfying the properties of linearity and continuity of Theorem 1
which, alone, feature in all that we have just done (except for calculating 
the “double Riemann sums”, when we have to modify a little to eliminate 
discontinuous functions). In other words, it is not the explicit construction 
of the integral which matters in these problems, but its formal properties. 
Mathematicians needed two hundred and fifty years to understand this, but 
we now have a century of experience. 


10 — Semicontinuous functions¹⁵

To know that the regulated functions are integrable is almost always enough 
in elementary practice, but it is not difficult, where we now are, to anticipate 
the “grand” integration theory. The essential tool is a famous theorem which 
would have been of great use to Cauchy: 


Dini's Theorem.¹⁶ Let $(f_n)$ be a monotone sequence of continuous real-valued
functions defined on a compact set $K \subset \mathbf{C}$ and converging simply
to a limit function f. Then f is continuous if and only if the $f_n$ converge
uniformly on K.


We can assume that the given sequence is increasing, whence
$f(x) = \sup f_n(x)$ for every $x \in K$. For every r > 0 and every $a \in K$, we then have


    $f(a) \ge f_n(a) > f(a) - r$ for n large.


If f is continuous, this relation is, for n given, again true on a neighbourhood
of a. By BL, we can then find a finite number of points $a_p \in K$ and open
balls $B(a_p)$ covering K such that each relation


(10.1)    $x \in K \cap B(a_p) \Longrightarrow f(x) \ge f_n(x) > f(x) - r$


is separately true for n large. These relations being finite in number, they
are simultaneously true for n large (unnecessary to rely on the $N_p$ and their
maximum ...) and since the $B(a_p)$ cover K, this means that, for n large,
we have


    $f(x) \ge f_n(x) > f(x) - r$


for every $x \in K$, therefore $\|f - f_n\|_K \le r$, qed.


15 The contents of n° 10 and 11, preparation for the Lebesgue theory, will be
repeated in a more general framework in § 9, and in the Appendix to this chapter;
we shall use neither the results of these two n° nor those of § 9 before the chapter
devoted to them. Our aim here is to show the reader that it is not difficult to go
rather further than the traditional theory, the essential being to know how far
too far not to go...

16 Having followed the analysis courses of Joseph Bertrand and J.A. Serret at Paris
in 1866, as Dugac tells us in his thesis, p. 106, and having conceived serious
doubts as to the rigour of their ideas, doubts which his youth dissuaded him from
making public, Ulisse Dini, professor at Pisa (where there is an “Ecole normale
supérieure” which has produced a number of excellent Italian scientists), read
the Germans, educated himself on Weierstrass' course, and, in 1878, published in
Italian the first exposition of analysis according to Weierstrass' ideas and those
of his numerous disciples, followed in 1880 by a book on Fourier series. His book
was widely read because neither Weierstrass nor his disciples published anything
beyond duplicated manuscript courses of very limited distribution.

Exercise. Prove the theorem using BW. 

Consider for example, for x > 0, the sequence $f_n(x) = n(x^{1/n} - 1)$ of
Chap. II, n° 10; for $x \ge 1$, it is decreasing and tends to log x; the convergence
is therefore uniform on [1, b] for any b > 1 (try to prove this “by hand” ...).
The case of an interval [a, 1] (a > 0) reduces to the preceding on putting
x = 1/y. We therefore have uniform convergence on every compact $K \subset \mathbf{R}_+^*$.
Same conclusion for the sequence $(1 + x/n)^n$ for $x \ge 0$.
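
The uniform convergence promised by Dini's Theorem can at least be observed on a grid. In the Python sketch below the names `f_n` and `sup_gap` are ours, and the sup over [1, b] is only sampled at finitely many points, so this is an illustration rather than a verification. The sampled gap between $f_n$ and log decreases towards 0 as n grows, as it should for a decreasing sequence with continuous limit on a compact interval.

```python
import math

def f_n(n, x):
    """f_n(x) = n (x^(1/n) - 1), which decreases to log x for x >= 1."""
    return n * (x ** (1.0 / n) - 1.0)

def sup_gap(n, b, samples=1001):
    """Sampled sup over [1, b] of f_n(x) - log x."""
    xs = [1.0 + k * (b - 1.0) / (samples - 1) for k in range(samples)]
    return max(f_n(n, x) - math.log(x) for x in xs)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(f"n = {n:<5} sampled sup of f_n - log on [1, 5] = {sup_gap(n, 5.0):.6f}")
```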

Dini's Theorem holds not only for increasing sequences but also for what
we shall call increasing philtres¹⁷ of continuous functions; this terminology¹⁸
denotes any family $(f_i)_{i \in I}$ (not necessarily countable) or set $\Phi$ of real
functions defined on an arbitrary set and possessing the following property: for
any functions f and g in the family, or the set, there exists in the family,
or the set, a function h that majorises f and g simultaneously. The most
frequent case is that where


    $(f \in \Phi)\ \&\ (g \in \Phi) \Longrightarrow \sup(f, g) \in \Phi.$


The definition applies to functions defined on any set: the values of the func- 
tion, not of the variable, have to be real. This is trivially the case for an 
increasing sequence. This is also the case, on an interval of R (or, more gen- 
erally, in a metric space), of the set of continuous functions which are less 
than a given function. Similarly one defines decreasing philtres by reversing 
the sense of the inequalities. 

To extend Dini's Theorem to this general case, let us consider an increasing
philtre $\Phi$ of continuous real functions on the compact set $K \subset \mathbf{C}$ and
assume that the function

    $\varphi(x) = \sup_{f \in \Phi} f(x),$

the upper envelope of $\Phi$ (i.e., in the case of an increasing sequence, its limit),
is everywhere finite and continuous. For every r > 0 and every $a \in K$ there
exists an $f \in \Phi$ such that $\varphi(a) - r < f(a)$; since f and $\varphi$ are continuous this
inequality is again valid on a neighbourhood of a in K. By BL, one can then
find a finite number of $a_p \in K$ and a finite number of $f_p \in \Phi$ and balls $B(a_p)$
covering K, such that


    $\varphi(x) - r < f_p(x)$ on $K \cap B(a_p)$


for every p. Since $\Phi$ is an increasing philtre and since the $f_p$ are finite in
number, there exists an $f \in \Phi$ which majorises¹⁹ the $f_p$. Then a fortiori
$\varphi(x) - r < f(x)$ in $K \cap B(a_p)$ for any p, so in all K. Since anyhow $f(x) \le \varphi(x)$,
one finds finally that $\|\varphi - f\|_K \le r$ and, trivially, that $\|\varphi - g\|_K \le r$ for every
$g \in \Phi$ majorising f. This is Dini's Theorem in this more general framework.


17 Translator's note: shades of Isolde & Brangaene!

18 A little less barbarous than N. Bourbaki's “increasing filtering sets”; I use the
spelling “philtre” because the word “filter” is employed in a different sense in
general topology. I have known the Bourbaki milieu well, and myself absorbed
bourbachique philtres during the “grande époque” of filters enough to think that
my spelling corresponds better to the psychological background of the subject.


Since the continuous functions are integrable over a compact interval K
of R the result we have just obtained shows that, in this case,


    $m(\varphi) = \sup_{f \in \Phi} m(f),$


where we revert to the notation $m(f) = \int f(x)\,dx$ of n° 2 for integration
over K. The left hand side is greater than the right hand side since $\varphi$
majorises all the $f \in \Phi$; but the existence of an f such that $\|\varphi - f\|_K \le r$,
hence such that the integrals of $\varphi$ and f are equal to within m(K)r, shows
that in fact the two sides are equal. This argument calls on no more than
Theorem 1; the explicit construction of the integral features here no more
than in the preceding n°.

Dini’s Theorem serves, in the Bourbaki version which we shall follow 
approximately, as the point of departure on the “grand” theory of integration 
in view of the following result, in which connection the reader is invited to 
revise the generalities of Chap. II, n° 17 on infinite limits: 


Corollary 1. Let K be a compact interval and $(f_n)$, $(g_n)$ two everywhere
increasing or two everywhere decreasing sequences of real continuous functions
on K. Assume that $\lim f_n(x) = \lim g_n(x)$ for every $x \in K$. Then


    $\lim m(f_n) = \lim m(g_n)$


or, in traditional notation,


    $\lim \int_K f_n(x)\,dx = \lim \int_K g_n(x)\,dx.$

Consider for example the case of increasing sequences, put


    $\varphi(x) = \sup f_n(x) \le +\infty$


and consider the set $C_{\inf}(\varphi)$ of all real functions h defined and continuous on
K such that $h(x) \le \varphi(x)$ for every x; we shall show that


(10.2)    $\sup_n m(f_n) = \sup_{h \in C_{\inf}(\varphi)} m(h),$


which will establish the corollary since the result does not involve the
particular sequence $(f_n)$.


19 If for example one has three functions f, g, h in $\Phi$, then there exists a $k \in \Phi$
which majorises f and g, then a $p \in \Phi$ which majorises k and h, so majorising
f, g and h.

Put

    $M = \sup m(f_n) = \lim m(f_n) \le +\infty$


and $h_n = \inf(h, f_n)$ for every continuous function $h \le \varphi$. The $h_n$ are $\le h$
and form an increasing sequence of continuous functions like the $f_n$. For
every $x \in K$ and every r > 0, we have $h(x) - r < f_n(x)$ for n large since
$h(x) \le \varphi(x) = \sup f_n(x)$: this is condition (SUP 2') in the definition of an
upper bound (Chap. II, n° 9). Therefore $h(x) - r < h_n(x)$ for n large, and
since $h_n(x) \le h(x)$ we conclude that $h(x) = \sup h_n(x)$ for every x.

By Dini's Theorem the $h_n \le f_n$ converge uniformly to h, whence
$m(h) = \lim m(h_n) \le \lim m(f_n) = M$. This inequality holding for every continuous
$h \le \varphi$ we may deduce that the right hand side of (2) is $\le M$. But among the
$h \in C_{\inf}(\varphi)$ are the $f_n$ themselves, so that the right hand side of (2) majorises
$m(f_n)$ for every n; it is therefore $\ge M$. Whence (2) and the corollary, with,
moreover, the more precise result (2), qed.

It is almost obvious that the preceding corollary still holds if one
substitutes increasing philtres $\Phi$ and $\Psi$ of continuous functions in place of the
sequences $f_n$ and $g_n$:


    $\sup_{f \in \Phi} f(x) = \sup_{g \in \Psi} g(x) \Longrightarrow \sup_{f \in \Phi} m(f) = \sup_{g \in \Psi} m(g).$

To see this, first consider an $h \in C_{\inf}(\varphi)$ and the functions inf(f, h) where
$f \in \Phi$; if $f', f'' \in \Phi$ and if $f \in \Phi$ majorises f' and f'', it is clear (sketch!) that
inf(f, h) majorises inf(f', h) and inf(f'', h); the functions inf(f, h) thus form,
for h given, an increasing philtre of continuous functions whose upper
envelope is, as above, the function h itself. By Dini's Theorem for philtres, m(h) is
then the upper bound of the integrals of the inf(f, h), themselves majorised by
the integrals of the $f \in \Phi$; we conclude that $\sup m(h) \le \sup m(f)$; but since
$\Phi \subset C_{\inf}(\varphi)$, the opposite relation is obvious, whence sup m(f) = sup m(h)
and, likewise, = sup m(g).

The preceding corollary leads to a simple proof of a result which the 
Lebesgue theory allows one to extend to arbitrary series of integrable func- 
tions, though clearly with more work: 


Corollary 2. Let $\sum u_n(x)$ be a series of continuous functions on a compact
interval K. Assume that the series converges simply to a continuous function
s(x) and that

(10.3)    $\sum \int_K |u_n(x)|\,dx < +\infty.$


Then


(10.4)    $\int_K s(x)\,dx = \sum \int_K u_n(x)\,dx.$


To prove (4), one may assume s = 0 by replacing $u_1$ by $u_1 - s$, which
is again continuous. One may also assume the $u_n$ real and then use the
decomposition $u_n = u_n^+ - u_n^-$ of n° 2. These positive functions again satisfy the
hypothesis (3), since $|u_n^{\pm}| \le |u_n|$; and since s(x) = 0 we now have
$\sum u_n^+(x) = \sum u_n^-(x) \le +\infty$ for every x. Since a series with positive terms leads to an
increasing sequence on considering its partial sums, Corollary 1 shows that

    $\sum m(u_n^+) = \sum m(u_n^-).$

Since the two sides are finite by (3) and the inequality $m(u_n^{\pm}) \le m(|u_n|)$, we
have $\sum m(u_n) = 0$ on subtracting, qed.

Once again, only the formal properties of Theorem 1 are needed for the 
proof. 

Condition (3) is satisfied if the given series is normally convergent on K, 
but the hypothesis (3) is weaker, even though in elementary practice one 
almost always verifies (3) by normal convergence. 
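
A simple instance of (10.4), chosen by us for illustration, is the exponential series $u_n(x) = x^n/n!$ on K = [0, 1], which is normally convergent and so satisfies (3). The Python sketch below compares the sum of the integrals, computed exactly term by term, with a midpoint-rule approximation of the integral of the sum s(x) = exp(x). The function names are our own.

```python
import math

def integral_of_term(n):
    """Exact integral over K = [0, 1] of u_n(x) = x**n / n!."""
    return 1.0 / math.factorial(n + 1)

def sum_of_integrals(terms=30):
    """Partial sum of the series of integrals in (10.4)."""
    return sum(integral_of_term(n) for n in range(terms))

def integral_of_sum(samples=20_000):
    """Midpoint rule for the integral over [0, 1] of s(x) = exp(x)."""
    h = 1.0 / samples
    return h * sum(math.exp((k + 0.5) * h) for k in range(samples))

if __name__ == "__main__":
    print("sum of the integrals :", sum_of_integrals())
    print("integral of the sum  :", integral_of_sum())
    print("e - 1                :", math.e - 1.0)
```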


Corollary 1 for increasing sequences, or its “philtrological” version, and,
more precisely, the relation (2), lead us to put

(10.5)    m^*(\varphi) = \sup_{f \in C_{\inf}(\varphi)} m(f)

for every function φ which can be exhibited as the limit of an increasing
sequence of continuous functions or, more generally (?), for which

(10.6)    \varphi(x) = \sup_{f \in C_{\inf}(\varphi)} f(x)

for every x ∈ K; such a function takes its values in ]−∞, +∞]. As we saw in
Corollary 1 one might define m*(φ) replacing C_inf(φ) by any other increasing
philtre Φ of continuous functions with upper envelope φ. If m*(φ) < +∞ we
shall say that φ is integrable and put m(φ) = m*(φ), the integral of φ. As
we shall see in the following n° this generalisation²⁰ of the Riemann integral
has even simpler properties than the former, despite the fact that it applies
only to very particular functions; but all these properties will be extended
later to general integrable functions.


20 Generalisation, since if φ is continuous the “new” definition of m(φ) reduces
to the old, for then φ ∈ C_inf(φ).

First, let us elucidate a crucial point: how to characterise those functions φ
satisfying (6) by properties of an “internal” nature? These are the lower
semicontinuous functions, or, for short, the lsc functions of Baire.

Let us work on a not necessarily compact interval X of R, and consider
on X a function φ with values in ]−∞, +∞] satisfying (6), i.e. which is
the upper envelope of a family of continuous real functions (which clearly
excludes the value −∞); for example the function 1/x²(x − 1)² on R, with
value +∞ for x = 1 or 0. For every a ∈ X and every M < φ(a), there is, by
the definition of an upper bound, a continuous function f on X satisfying

    f(x) ≤ φ(x) for every x,    f(a) > M.

Since f is continuous, we again have f(x) > M on a neighbourhood of a, and
since φ majorises f, it follows that

(10.7)    \varphi(a) > M \implies \varphi(x) > M \text{ for every } x \in X \text{ near } a.

This is the property which defines the lsc functions; equivalently, one may
demand that, for every finite M, the set {φ > M} of the x ∈ X where
φ(x) > M must be open in X since then, if it contains a, it must also
contain all the points of X sufficiently near a. Whence we deduce that the
sets {φ ≤ M} are closed²¹.

If φ(a) is finite, we may, in (7), choose M = φ(a) − r with an arbitrary
r > 0, whence

(10.8)    \varphi(x) > \varphi(a) - r \text{ on a neighbourhood of } a,

i.e. for every x ∈ X such that |x − a| < r′, in our usual notation. Continuity
would force φ(x) < φ(a) + r too, but this is precisely what we do not demand
of the lsc functions, whence their name. The continuous functions are
characterised by the fact that both f and −f are lsc. For a regulated function φ
condition (8) is equivalent to saying that the right and left limit values of φ
are ≥ φ(a) at every a ∈ X.

The reader can easily check that 


(i) the sum of a finite number of lsc functions is lsc,

(ii) if φ and ψ are lsc, then so are the functions sup(φ, ψ) and inf(φ, ψ),

(iii) the upper envelope sup φ_i(x) of a finite or infinite family (φ_i) of lsc
functions is again lsc,

(iv) the sum, finite or not, of a series of positive lsc functions is again lsc.


Properties (i) and (ii) are proved by imitating what we have established 
for continuous functions. (iii) is a direct application of the definition of upper 
bounds — one can never say too often that the only useful “property” of upper 


21 To distinguish weak from strict inequalities is as crucial in all these questions as 
to distinguish open from closed sets. 

bounds is their definition; (iv) follows from (i) and (iii) since the partial sums
of the series form an increasing sequence. Property (ii) shows in particular
that if φ is lsc, then the functions inf[φ(x), n] obtained by “truncating” the
graph of φ above a horizontal n are again lsc; the converse follows from (iii).


(v) the characteristic function χ_U of a subset U of X, equal to 1 on U and
to 0 elsewhere, is lsc on X if and only if U is open in X.


The set {χ_U > M} is X if M < 0, U if 0 ≤ M < 1 and empty²² if M ≥ 1.
One must pay attention to the fact that “open in X” does not mean the 
same as “open in R”, unless X itself is open. 

Since the lsc functions are “half continuous”, one might assume that they 
“half” satisfy the theorems applicable to continuous functions. This is some- 
times justified: 


(vi) let φ be an lsc function on an interval X and K a compact subset of
X; then φ is bounded below on K and there exists a point of K where φ
attains its minimum.


For every n ∈ N the set A_n = {φ ≤ −n} is closed in X, so that A_n ∩ K is
compact; these intersections form a decreasing sequence so have a common
point a if they are all nonempty (Corollary 1 of BL). Absurd, since then we
would have φ(a) = −∞, an eventuality excluded by the definition of the lsc
functions.

Now let m be the lower bound of the φ(x), x ∈ K. For every n ∈ N the set
K_n of the x ∈ K where φ(x) ≤ m + 1/n is nonempty (definition of a lower
bound) and closed (definition of the lsc functions); since the K_n decrease
they have a common point c ∈ K, and clearly φ(c) = m, qed.


We have seen above that every function φ satisfying (6) on an interval X
is lsc; the converse holds if one assumes that there is a continuous function
f ≤ φ on X, and thus, by (vi), if X is compact. Since φ − f = φ + (−f)
is lsc, it suffices to treat the case of a positive function. For a ∈ X and
M < φ(a) given, it reduces to constructing a continuous function f ≤ φ
satisfying f(a) > M. Now there exists an r > 0 such that we again have
φ(x) > M for those x ∈ X such that |x − a| < r. Figure 6 shows the
construction of f, and does not require comment. We could in fact construct 
an increasing sequence of continuous functions converging to y [Dieudonné, 
Vol. 2, (12.7.8)], but this is quite unnecessary for the needs of integration 
theory, because of (2). 
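
To make the existence of such an increasing sequence more tangible, here is a minimal sketch using one standard device, the formula f_n(x) = inf over y of [φ(y) + n|x − y|], which yields n-Lipschitz functions below φ that increase with n. This is an assumption of the illustration: it is not claimed to be the construction of Figure 6 or that of the cited reference. The infimum is taken only over a finite grid of K = [0, 1], and φ is taken to be the characteristic function of the open interval ]1/4, 3/4[; all names are hypothetical.

```python
from fractions import Fraction

def chi_open(x):
    """Characteristic function of the open interval ]1/4, 3/4[, an lsc function."""
    return 1 if Fraction(1, 4) < x < Fraction(3, 4) else 0

def lower_approximation(phi, grid, n):
    """Grid values of f_n(x) = min over grid points y of phi(y) + n*|x - y|."""
    return [min(phi(y) + n * abs(x - y) for y in grid) for x in grid]

grid = [Fraction(k, 100) for k in range(101)]   # sample points of K = [0, 1]
phi_values = [chi_open(x) for x in grid]

f_10 = lower_approximation(chi_open, grid, 10)    # ramps of slope 10 at the end-points
f_100 = lower_approximation(chi_open, grid, 100)  # already equal to phi on this grid
```

On the grid one has f_10 ≤ f_100 ≤ φ, and the discontinuities of φ are replaced by steeper and steeper ramps as n grows.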


22 The empty set is open because, not containing any point, it has no difficulty in 
satisfying the definition of an open set (all who live at least 500 years end up 
dying in a car accident). Moreover, since the complement of the empty set is the 
whole space, which is closed, it must be open. This argument also shows that 
the empty set is closed. 

Fig. 6.


In all the above we have dealt with the upper envelopes of continuous 
functions, but of course the lower envelopes of such functions, the upper 
semicontinuous functions or usc functions, are no less important. These take
their values in [−∞, +∞[. One passes trivially from lsc to usc by remarking
that

    φ is lsc ⟺ −φ is usc.


You may therefore, if it appeals to you, translate all the properties of the 
lsc functions into properties of the usc functions: it is enough to reverse the
sense of all the inequalities and to replace the word “increasing” by the word 
“decreasing” everywhere. There is a theorem on the maximum, and not on the 
minimum, for usc functions on a compact set. Every usc function majorised 
by a continuous function is the lower envelope of the continuous functions 
which majorise it; this is always the case of a usc function on a compact 
interval by the maximum theorem. Likewise, the characteristic function of a 
set is usc if and only if the set is closed. 

Finally, it is clear that the continuous functions are the only functions 
that are simultaneously lsc and usc.

For a usc function ψ on a compact interval K let C_sup(ψ) be the set of
continuous functions f ≥ ψ; we then put

(10.9)    m^*(\psi) = \inf_{f \in C_{\sup}(\psi)} m(f) \ge -\infty,

so that m*(ψ) = −m*(−ψ) where the right hand side is the integral of an
lsc function.


11 — Integration of semicontinuous functions 


Let us now return to the integrals of lsc functions over a compact interval
K; these functions are bounded below but not above, so that their integrals,
defined by (10.5), are > −∞ but ≤ +∞. The essential point in the proofs is
that, in (10.5), one can replace C_inf(φ) by any increasing philtre of continuous
functions having upper envelope φ.


(i) Additivity of the integral:

(11.1)    m^*(\varphi + \psi) = m^*(\varphi) + m^*(\psi).

Let Φ be the set of functions of the form f + g with f ∈ C_inf(φ) and
g ∈ C_inf(ψ). It is clear that Φ is an increasing philtre of continuous functions
(apply the definitions) whose upper envelope is²³ φ + ψ. Hence

    m^*(\varphi + \psi) = \sup m(f + g) = \sup m(f) + \sup m(g) = m^*(\varphi) + m^*(\psi).

Similarly one can show that m*(λφ) = λ m*(φ) for every constant λ > 0,
and even if λ = 0 so long as we define 0 · (+∞) = 0. (Multiplying an lsc function
by −1 makes it usc.)


(ii) Passage to the limit under the ∫ sign in an increasing sequence:

(11.2)    m^*(\sup \varphi_n) = \sup m^*(\varphi_n) \le +\infty.

Let φ(x) = sup φ_n(x). Put Φ_n = C_inf(φ_n) for every n and let Φ be the
union of the Φ_n, i.e. the set of continuous functions f satisfying f ≤ φ_n
for some n. This is an increasing philtre: for if f ≤ φ_p and g ≤ φ_q then
sup(f, g) = h ≤ φ_r for r ≥ max(p, q), and consequently h ∈ Φ. Finally, φ is
the upper envelope of the f ∈ Φ, for

    \varphi(x) = \sup_n \varphi_n(x) = \sup_n \sup_{f \in \Phi_n} f(x) = \sup_{f \in \bigcup \Phi_n} f(x)

by the associativity of upper bounds (Chap. II, end of n° 9). We conclude
that

    m^*(\varphi) = \sup_{f \in \Phi} m(f) = \sup_n \sup_{f \in \Phi_n} m(f) = \sup_n m^*(\varphi_n)

by the definition of m*(φ_n), qed.


(iii) Integration term-by-term:

(11.3)    m^*\Bigl(\sum \varphi_n\Bigr) = \sum m^*(\varphi_n) \le +\infty

for every series of positive lsc functions. Write s and s_n for the total sum
and the partial sums of the series of the φ_n. Since the φ_n are positive these
partial sums form an increasing sequence of lsc functions of which s is the
limit. The integral of the left hand side is thus the limit of the integrals of


23 We have already mentioned that if A and B are two subsets of R and A + B is
the set of u + v with u ∈ A and v ∈ B, then sup(A + B) = sup A + sup B.

the s_n by (ii), i.e., by (i), of the partial sums of the series of the integrals,
qed. Note that if the φ_n are not positive the sum of the series need not even
be lsc.

If, in particular, the functions φ_n satisfy m*(φ_n) < +∞, i.e. are m-integrable
(by definition), and if ∑ m*(φ_n) < +∞, then the sum of the series
is again integrable and one may integrate it term-by-term. In particular, for
positive lsc functions,

(11.4)    m^*(\varphi_n) = 0 \text{ for every } n \implies m^*\Bigl(\sum \varphi_n\Bigr) = 0.


Since the characteristic function of an open set U in K is lsc one may
define the measure of an open set by putting

(11.5)    m(U) = m^*(\chi_U),

a number clearly lying between 0 and m(K); it is clear more generally that

(11.6)    U \subset V \implies m(U) \le m(V)

since then χ_U ≤ χ_V. It is easy to see that, when U is an interval, m(U) reduces
to its usual length; for this obviously majorises m(f) for every continuous
function f ≤ χ_U (i.e. ≤ 1 on U and ≤ 0 elsewhere), but on replacing the
discontinuities of the graph of the characteristic function at the end-points 
of U by almost vertical line segments joining 0 to 1, one constructs functions 
f whose integral is arbitrarily close to the length of U. 

Properties (i), (ii) and (iii) above translate immediately:

(i′) if U and V are open in K then m(U ∪ V) ≤ m(U) + m(V) and

(11.7)    m(U \cup V) = m(U) + m(V) \text{ if } U \text{ and } V \text{ are disjoint.}

Obvious, since χ_{U∪V} ≤ χ_U + χ_V in general and, in the last case, we have
χ_{U∪V} = χ_U + χ_V.

(ii′) if (U_n) is an increasing sequence of open sets then

(11.8)    m\Bigl(\bigcup U_n\Bigr) = \lim m(U_n) = \sup m(U_n).

Obvious since the characteristic function of the union is the limit of the
increasing sequence of those of the U_n.

(iii′) if (U_n) is any sequence of open sets then

(11.9)    m\Bigl(\bigcup U_n\Bigr) \le \sum m(U_n)

with equality if the U_n are pairwise disjoint.

Obvious, since the characteristic function of the union is less than the sum of
the characteristic functions of the U_n, and is equal to it if the U_n are pairwise
disjoint.

This allows us to calculate the measure of any open U C K explicitly. 
First, note that for every a € U the union of all the intervals containing a 
and contained in U is again an interval U(a), clearly open in K like U itself, 
and of length > 0; U is the union of these U(a). It is immediate that, for
a ≠ b, either U(a) = U(b) or U(a) ∩ U(b) = ∅. The set (and not the family) of
the U(a) is countable, for those of these intervals which are of length > 1/p 
are at most m(K)p in number since they are pairwise disjoint. Thus we see 
that every open subset U of K (and, in fact, of any interval, compact or not, 
and in particular of R) is the union of a finite or countable family of open 
pairwise disjoint intervals. The measure of U is then, by (iii’), the sum of the 
measures of these intervals. 
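
When U is given as a finite union of open intervals, the decomposition just described can be carried out mechanically. The following is a minimal sketch under that assumption (a finite list of pairs (a, b) standing for ]a, b[); the function names are hypothetical and the general countable case is of course not covered.

```python
def components(intervals):
    """Merge open intervals ]a, b[ into pairwise disjoint open intervals."""
    merged = []
    for a, b in sorted(iv for iv in intervals if iv[0] < iv[1]):
        # Two open intervals lie in the same component only if they overlap:
        # ]0, 1[ and ]1, 2[ merely touch, and the point 1 is not in their union.
        if merged and a < merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], b)
        else:
            merged.append([a, b])
    return [tuple(c) for c in merged]

def measure(intervals):
    """Sum of the lengths of the disjoint components, as in (iii')."""
    return sum(b - a for a, b in components(intervals))

example = [(0, 2), (1, 3), (5, 6)]
parts = components(example)      # the components ]0, 3[ and ]5, 6[
total = measure(example)         # 3 + 1
```

The sum of the lengths of the three given intervals is 5, whereas the measure of their union is only 4, in agreement with the inequality (11.9).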

We leave the reader the task of translating all these properties into terms 
of usc functions and of closed sets. To go further along this path would 
oblige us to develop all the Lebesgue theory. The reader may find these 
considerations insufficient: for, at this point of the exposition, (i) we are not 
yet able to integrate the difference of two lsc functions for the reason that it
is neither lsc nor usc²⁴, (ii) we have considered integrals only over compact
intervals. These limitations will be removed in the Appendix to this Chapter. 

Exercise. We say that a set N ⊂ K is of measure zero if, for every r > 0
there is an open U ⊃ N such that m(U) < r. (i) Show that the union of a
finite or denumerable family of sets of measure zero is of measure zero (use
the fact that r = ∑ r/2^n). Show that Q ∩ K is of measure zero. (ii) Let φ be a
positive lsc function such that m*(φ) < +∞; show that the set {φ(x) = +∞}
is of measure zero (use the sets {φ(x) > n}).
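
The hint r = ∑ r/2^n can be tried out on the rationals of K = [0, 1]. The sketch below is only an illustration under stated assumptions: the rationals are enumerated by increasing denominator (any enumeration would do), the n-th one is placed in an open interval of length r/2^n, and only finitely many of them are handled. The function names are hypothetical, and the intervals should be intersected with K to obtain sets open in K.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` rationals of [0, 1], enumerated by increasing denominator."""
    found, seen, q = [], set(), 1
    while len(found) < count:
        for p in range(q + 1):
            x = Fraction(p, q)
            if x not in seen:
                seen.add(x)
                found.append(x)
                if len(found) == count:
                    break
        q += 1
    return found

def covering(count, r):
    """The n-th rational (n = 1, 2, ...) sits in an open interval of length r / 2**n."""
    return [(x - r / 2 ** (n + 1), x + r / 2 ** (n + 1))
            for n, x in enumerate(rationals_in_unit_interval(count), start=1)]

r = Fraction(1, 10)
points = rationals_in_unit_interval(50)
cover = covering(50, r)
total_length = sum(b - a for a, b in cover)   # r * (1 - 2**-50), hence < r
```

By (11.9) the measure of the union of these intervals is at most the sum of their lengths, which stays below r however many rationals are covered.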


24 The ingenious reader will observe that if φ′, φ″, ψ′ and ψ″ are lsc functions
with finite values such that φ′ − ψ′ = φ″ − ψ″, then φ′ + ψ″ = φ″ + ψ′, so
m(φ′) + m(ψ″) = m(φ″) + m(ψ′), so m(φ′) − m(ψ′) = m(φ″) − m(ψ″). One
may thus define m(θ) = m(φ′) − m(ψ′) without ambiguity for every function θ
which can be expressed in the form of a difference of two positive lsc functions
with finite values. These functions form a vector space over R, etc. But this is
not the best method for obtaining the general integrable functions: we still lack
“almost everywhere zero” functions.

§ 3. The “Fundamental Theorem” (FT)


12 — The fundamental theorem of the differential and integral calculus


Let us return to much more elementary considerations and introduce the
notion of an oriented integral, analogous to that of a vector on a line. To do
this, first observe that we always have

(12.1)    \int_a^b f(x)\,dx + \int_b^c f(x)\,dx = \int_a^c f(x)\,dx

if a < b < c. This is geometrically obvious, and has been proved in n° 2,
additivity formula (2.8). Since (1) shows that

    \int_b^c = \int_a^c - \int_a^b,

we may then write this relation in the form

    \int_b^c = \int_b^a + \int_a^c,

agreeing that

(12.2)    \int_u^v f(x)\,dx = -\int_v^u f(x)\,dx \quad \text{if } u > v.

As in the case of the vectors, the relation (1) remains valid with no hypothesis
on the respective positions of a, b and c.

Having said this, let f be a scalar function defined on an interval I (of
any kind), and assume that f has right and left limits at each point of I, i.e.
that f is regulated. We may then, for a given a ∈ I, consider the function

(12.3)    F(x) = \int_a^x f(t)\,dt

on I, with an oriented integral in the preceding sense, so the opposite of
the ordinary integral for x < a. We have denoted the phantom variable of
integration by t so as not to confuse it with the variable x in F; one can
replace t by y, u, $ or anything one likes, except x, f or d ...

By the properties of oriented integrals

    F(x+h) - F(x) = \int_x^{x+h} f(t)\,dt.

This relation shows first of all that F is continuous: since f is bounded on
every compact interval K ⊂ I, the preceding integral is O(h) when h tends
to 0.

If, on the other hand, h is > 0 and small enough, then the function f is,
on ]x, x + h], almost equal to the limit value f(x+), so that its integral is
almost equal to hf(x+) since the value taken by f at the point x, or at any
other individual point, has no influence on that of the integral; the quotient
[F(x + h) − F(x)]/h is therefore almost equal to f(x+), so tends to this limit
as h > 0 tends to 0.

Weierstrass’s Epsilontik is missing from this argument. To introduce it,
first observe that

    \int_x^{x+h} f(x+)\,dt = h f(x+)

because f(x+) is not a function of the variable of integration t, in other
words, behaves like a constant function of t for x given. (Whence, once more,
the necessity of not mixing the phantom or bound variables like t and the
free variables like x.) Then

(12.4)    \frac{F(x+h) - F(x)}{h} - f(x+) = \frac{1}{h} \int_x^{x+h} [f(t) - f(x+)]\,dt.

Now for every r > 0, there exists an r′ > 0 such that

    x < t < x + r' \implies |f(t) - f(x+)| < r.

The right hand side of (4) is then, in modulus, ≤ r as soon as 0 < h < r′. We
argue similarly for h < 0, replacing the interval ]x, x + r′[ by an interval
]x − r″, x[ and f(x+) by f(x−). The left hand side then tends to 0, whence:


Theorem 11. Let f be a regulated function on an interval I in R. Then the
function F defined by the relation (3) is continuous and has right and left
derivatives equal to f(x+) and f(x−) at each point x ∈ I.
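
The two one-sided derivatives can be observed numerically. The sketch below assumes, as an example not taken from the text, the step function equal to 1 for t < 0 and to 3 for t ≥ 0, with a = 0 in (3); the oriented integral is approximated by a midpoint Riemann sum and the names are hypothetical.

```python
def oriented_integral(f, a, b, steps=10_000):
    """Midpoint Riemann sum of f from a to b; the sign flips when b < a."""
    width = (b - a) / steps
    return sum(f(a + (k + 0.5) * width) for k in range(steps)) * width

def f(t):
    # Assumed regulated function with a jump at 0: f(0-) = 1 and f(0+) = 3.
    return 1.0 if t < 0 else 3.0

def F(x):
    # The function (3) with a = 0.
    return oriented_integral(f, 0.0, x)

h = 1e-3
right_quotient = (F(0.0 + h) - F(0.0)) / h      # close to f(0+)
left_quotient = (F(0.0 - h) - F(0.0)) / (-h)    # close to f(0-)
```

The function F is continuous at 0 but its graph has a corner there: the right quotient is close to 3 and the left quotient close to 1.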


This result, for functions as then understood, is already in Newton in
1665-66 with essentially the same proof, phrased in his language of fluentes
and fluxions (Chap. III, n° 14): if y is the fluent which defines the curve
[y = f(x) in the language of today] and if z is the area [i.e. the integral]
between a fixed abscissa and the abscissa of the fluent x, then the infinitesimal
increase żo of z is the product of y by the infinitesimal increase ẋo of x,
which means that ż/ẋ = y; if one assumes that ẋ = 1, in other words that
x is the “time” with respect to which one derives one’s fluentes to calculate
their fluxions, one has ż = y, which, even in his conception of derivatives as
“speeds of variation in time”, means that the derivative of the area z with
respect to the variable x is the ordinate y of the graph at the point x; in
Leibniz’ style, dz/dx = y. For them, calculating the area is the same as
calculating a fluent z satisfying this relation. Newton justifies nothing, not even
the fact that the relation ż = y determines z up to an additive constant,
which is, however, the crucial point; a few lines suffice for him to formulate
his result²⁵ which he illustrates by examples. With Leibniz, everything is simple
too: since F(x) is the “continuous sum” of the infinitely small quantities
f(t)dt where t varies from the left hand end of the area to x, the infinitesimal
increase in F when one passes from x to x + dx is f(x)dx, whence
dF = f(x)dx and f(x) = dF/dx. Here again, the justifications awaited the
XIXth century, but since the method worked admirably, for 150 years no one
bothered to provide the rigorous proofs by “exhaustion” of Chap. II, n° 11 ...


Recall that the set D of discontinuities of a regulated function f is finite
or countable (Theorem 6 or Chap. III, n° 12). Outside D the function F is
therefore differentiable, with F′(x) = f(x).

When the function f is continuous, the function F is even differentiable
on all the interval considered, and

(12.5)    F'(x) = f(x) \text{ for every } x


in this case. One says then that F is a primitive of f. These arguments prove 
the existence of a primitive for every continuous function. This result is not 
at all clear for a function which, though continuous, may be so savage that 
one cannot represent it graphically. 

Now, in contrast to Newton, who did not even pose the question since he 
did not see (and one still does not see it ...) how a graph all of whose tangents 
are horizontal could be anything other than a line, we know (Chap. III, n° 16) 
that if two everywhere differentiable functions on an interval have the same 
derivatives everywhere, then their difference is constant. Since the addition 
of a constant to the function F defined by (3) does not change the difference
F(b) − F(a), we obtain the following result:


Theorem 12 (FT). Let f be a scalar function defined and continuous on
an interval I in R and let F be a primitive of f, i.e. a differentiable function
such that F′(x) = f(x) for every x ∈ I. Then

(12.6)    \int_a^b f(x)\,dx = F(b) - F(a)

for any a, b ∈ I.


In this way we find the results of n° 4, Example 1, again, for analytic 
functions, but in a much more general framework. 


Example 1. For x > 0 and s ∈ C, the function

    x^s = \exp(s \log x)


25 Tractatus de Methodis Serierum et Fluxionum, pp. 195-197 and 211 of Vol. III 
of the Mathematical Papers. 

has derivative s x^{s−1} [Chap. IV, formula (10.10)]. The function x^{s+1}/(s + 1)
is therefore, for s ≠ −1, a primitive of x^s. Whence the formula

(12.7)    \int_a^b x^s\,dx = \frac{b^{s+1} - a^{s+1}}{s+1},

already obtained for s ∈ N by a direct calculation of the integral (Chap. II,
n° 11), but now valid for any s ∈ C, s ≠ −1.


Example 2. For x > 0 the function log x has derivative 1/x; hence again the
formula

    \int_a^b dx/x = \log b - \log a    (0 < a, b)

of Chap. II, n° 11.


Example 3. The derivative of the function arctan x is 1/(1 + x²); whence

    \int_a^b \frac{dx}{1 + x^2} = \arctan b - \arctan a;

we must pay attention to the fact that, in this calculation, we take the
“principal determination” of arctan, that given by the relation

    y = \arctan x \iff \{(x = \tan y) \;\&\; (|y| < \pi/2)\}

or, equivalently, the inverse function of

    \tan : \; ]-\pi/2, \pi/2[ \; \to \mathbb{R}.


Example 4. For c ∈ C, c ≠ 0, the derivative of e^{cx}/c is e^{cx}; we again find the
formula

    \int_a^b e^{cx}\,dx = (e^{cb} - e^{ca})/c.
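
These formulas lend themselves to a quick numerical check, comparing a Riemann sum with the value F(b) − F(a) given by the FT. The sketch below makes arbitrary choices which are assumptions of the illustration and not of the text: s = 1/2 + 2i on [1, 2] for Example 1, the interval [0, 1] for Example 3, and c = i on [0, π] for Example 4. The helper names are hypothetical.

```python
import cmath
import math

def riemann_sum(f, a, b, steps=20_000):
    """Midpoint Riemann sum of a scalar (possibly complex) function over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (k + 0.5) * width) for k in range(steps)) * width

def power(x, exponent):
    # x**exponent = exp(exponent * log x) for x > 0 and a complex exponent.
    return cmath.exp(exponent * math.log(x))

# Example 1 with the assumed exponent s = 1/2 + 2i on [1, 2].
s = 0.5 + 2j
lhs_1 = riemann_sum(lambda x: power(x, s), 1.0, 2.0)
rhs_1 = (power(2.0, s + 1) - power(1.0, s + 1)) / (s + 1)

# Example 3 on [0, 1], with the principal determination of arctan.
lhs_3 = riemann_sum(lambda x: 1.0 / (1.0 + x * x), 0.0, 1.0)
rhs_3 = math.atan(1.0) - math.atan(0.0)

# Example 4 with the assumed constant c = i on [0, pi].
c = 1j
lhs_4 = riemann_sum(lambda x: cmath.exp(c * x), 0.0, math.pi)
rhs_4 = (cmath.exp(c * math.pi) - cmath.exp(c * 0.0)) / c
```

In each case the two sides agree up to the discretisation error of the Riemann sum; in the last case both are close to 2i.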


In practice, we often use the notation

(12.8)    F(b) - F(a) = F(x) \Big|_a^b;


this contradicts all the most elementary rules of mathematical logic with its 
x which might be a t, a # or a £ and which, despite its clearly phantom 
character, is not linked visibly to any of its occurrences. It would be more 
correct to write 


    F \Big|_a^b \quad \text{or even} \quad F(\ )\Big|_a^b,



especially when F' depends on several variables x, y, etc. But as we have said 
already, one does not change society, even mathematical society, by ukase. 

To denote a primitive of f, there is another notation, long universal, and
even more catastrophic, namely

    F(x) = \int f(x)\,dx,


omitting the limits of integration. Since the letter x on the right hand side 
denotes a bound variable and that on the left hand side a free variable, all the 
taboos are violated²⁶. The inventors of this system probably knew what they
were doing; Leibniz printed it for the first time in 1686 in his aptly named 
Geometria recondita, and for example wrote that 


[ow = 97/2, 


the word “integral” being introduced by Jakob Bernoulli in 1690 (Cantor, 
pp. 197, 218). But their principal reason is that they were much more con- 
cerned to calculate primitives rather than integrals between well defined lim- 
its, and, as we have already said, one had to wait for Fourier for the idea of 
displaying the integration limits in the integral notation. Imagine the con- 
fusions that this system must have provoked among less brilliant people, if 
not chez Leibniz, the Bernoullis or Euler — brains of this calibre are not born 
everyday. Not to ignore that relations such as 


    \int \cos x\,dx = \sin x, \qquad \int dx/x = \log x,


etc. may induce the same confusions even nowadays... 


Another way of formulating the preceding theorem starts from a function
f of class C¹, i.e. having a continuous derivative. Since f is a primitive of f′
one finds the relation

(12.9)    f(b) - f(a) = \int_a^b f'(x)\,dx,

as fundamental as the FT, for any a and b in the interval I considered, or,
in the language of indefinite integrals,

(12.10)    f(x) = \int f'(x)\,dx,


26 We shall use this notation frequently, and have already used it, though in quite
another context, namely to denote an integral extended over an interval men- 
tioned repeatedly in the calculations, and where no ambiguity is possible. This 
convention or abbreviation, which often allows us to write the integrals in the 
body of the text in clear language instead of using an extra line each time, 
economises on type and paper. 

or even f(x) = ∫ df(x). In this form, the reciprocity between derivative
and integral appears clearly. In the version (9) one chooses a subdivision
a = x_1 < x_2 < ... < x_{n+1} = b of [a, b], writes that

(12.11)    f(b) - f(a) = \sum_i [f(x_{i+1}) - f(x_i)]

and observes that the difference f(x_{i+1}) − f(x_i) is “almost” equal to
f′(x_i)(x_{i+1} − x_i) and, in fact, exactly equal to f′(ξ_i)(x_{i+1} − x_i) for some
ξ_i ∈ ]x_i, x_{i+1}[ by the mean value theorem if f is real. Substituting in the
above expression for f(b) − f(a), one finds exactly the Riemann sums which
define the integral (9). This type of argument was clearly known to Leibniz;
it is linked to “the calculus of finite differences” so popular in the XVIIth
and XVIIIth centuries. Leibniz’ notation makes these results as intuitive as
they must have been almost obvious in the eyes of the contemporaries who
did not worry about arithmetising analysis, whence their popularity.

The formula (9) explains Theorem 19 of Chap. III, n° 17 on differentiating
a uniform limit. Suppose we are given a sequence of functions f_n of class C¹
on an interval I, and assume that the f_n′ converge to a limit g uniformly on
every compact K ⊂ I. Then, by Theorem 4 of n° 4,

(12.12)    \int_a^x g(t)\,dt = \lim \int_a^x f_n'(t)\,dt = \lim [f_n(x) - f_n(a)]

for any a, x ∈ I. If the sequence f_n converges at the point a, it must then
converge everywhere to a function f, which, satisfying

(12.13)    f(x) - f(a) = \int_a^x g(t)\,dt

by (12), is a primitive of g; in other words, the derivative of the limit is the
limit of the derivatives. An immediate estimate of the integral of g − f_n′ then
shows that the sequence f_n converges uniformly on every compact set. But
Theorem 19 of Chap. III only assumes the existence of the f_n′, and not their
continuity.

We established a theorem on “differentiation under the ∫ sign” for integrals
of the form ∫ f(x, y)dx above. We may combine this with the FT to
obtain the following occasionally useful result:


Theorem 13. Let I and J be two intervals, f a continuous function on I × J
and let φ, ψ : I → J be two differentiable functions. Suppose that f has a
continuous derivative D_1 f on I × J. Then the function

(12.14)    g(x) = \int_{\varphi(x)}^{\psi(x)} f(x,y)\,dy

is differentiable on I, and

(12.15)    g'(x) = \int_{\varphi(x)}^{\psi(x)} D_1 f(x,y)\,dy + f[x, \psi(x)]\,\psi'(x) - f[x, \varphi(x)]\,\varphi'(x).

By subtracting, one need only prove this in the case where φ(x) = b is
constant. Put

    F(x,y) = \int_b^y f(x,t)\,dt,

whence g(x) = F[x, ψ(x)]. Since f is continuous the FT shows that

(12.16)    D_2 F(x,y) = f(x,y).

Theorem 9 shows on the other hand that

(12.17)    D_1 F(x,y) = \int_b^y D_1 f(x,t)\,dt.

By Theorem 9 (i) D_1 F is continuous on I × J, and since D_2 F = f is also
continuous we see that F is a function of class C¹ on I × J. We may thus
apply the chain rule to g(x) = F[x, ψ(x)] (Chap. III, n° 21), whence

    g'(x) = D_1 F[x, \psi(x)] + D_2 F[x, \psi(x)]\,\psi'(x);

one obtains the desired formula on substituting ψ(x) for y in (16) and (17).


Exercise. By the theorem, the function

    g(x) = \int_{x^2}^{x^3} \sin(xy)\,dy

has derivative

    g'(x) = \int_{x^2}^{x^3} \cos(xy)\,y\,dy + 3x^2 \sin(x^4) - 2x \sin(x^3).

Check this result by calculating the integrals that appear in g and g′ (for the
second, integrate by parts).
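
Formula (12.15) can also be tested numerically before any integral is calculated by hand. The sketch below assumes f(x, y) = sin(xy) with the limits φ(x) = x² and ψ(x) = x³, and compares a central difference quotient of g with the right hand side of (12.15) at the arbitrarily chosen point x = 1.2; the integrals are approximated by midpoint Riemann sums and all names are hypothetical.

```python
import math

def riemann_sum(f, a, b, steps=20_000):
    """Midpoint Riemann sum approximating the integral of f from a to b."""
    width = (b - a) / steps
    return sum(f(a + (k + 0.5) * width) for k in range(steps)) * width

def g(x):
    # g(x) = integral of sin(x*y) dy between the assumed limits x**2 and x**3.
    return riemann_sum(lambda y: math.sin(x * y), x ** 2, x ** 3)

def g_prime_formula(x):
    # Right hand side of (12.15): integral of D1 f, plus the two boundary terms.
    inner = riemann_sum(lambda y: y * math.cos(x * y), x ** 2, x ** 3)
    return inner + 3 * x ** 2 * math.sin(x ** 4) - 2 * x * math.sin(x ** 3)

x, step = 1.2, 1e-4
difference_quotient = (g(x + step) - g(x - step)) / (2 * step)
formula_value = g_prime_formula(x)
```

The two values agree to within the accuracy of the difference quotient, which is a check of the formula at one point and of course not a proof.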


Exercise. Prove (15) directly by writing

    g(x+h) - g(x) = \int_{\varphi(x+h)}^{\psi(x+h)} f(x+h, y)\,dy - \int_{\varphi(x)}^{\psi(x)} f(x,y)\,dy.


Since we have just made a new incursion into the functions of two real 
variables, let us show how one may exploit the FT to establish one of the 
fundamental results of Chap. III, n° 23: 


Theorem 14. Let f(x, y) be a function defined and continuous on I × J
where I and J are two intervals in R. Assume that f has continuous second
derivatives D_1D_2 f and D_2D_1 f on I × J. Then they are equal.

Since it is enough to verify the statement on a neighbourhood of an arbitrary
point of I × J one may reduce to the case where I = [a, b] and
J = [c, d] are compact. The FT applied to the functions y ↦ D_2D_1 f(x, y)
and x ↦ D_1 f(x, y) then shows that

(12.18)    \int_a^u dx \int_c^v D_2 D_1 f(x,y)\,dy = \int_a^u [D_1 f(x,v) - D_1 f(x,c)]\,dx
                = f(u,v) - f(a,v) - f(u,c) + f(a,c)

for any u ∈ I and v ∈ J. A similar calculation will show that

(12.19)    \int_c^v dy \int_a^u D_1 D_2 f(x,y)\,dx = f(u,v) - f(a,v) - f(u,c) + f(a,c),

a result obviously identical to that furnished by (18). But we already know
that the order of integration is unimportant in these double integrals. Putting

    g(x,y) = D_2 D_1 f(x,y) - D_1 D_2 f(x,y),

we thus obtain a continuous function on I × J such that

(12.20)    \int_a^u dx \int_c^v g(x,y)\,dy = 0

for any u ∈ I and v ∈ J. On differentiating with respect to u one finds that
the integral of g(u, y) between c and v is zero for any v. On differentiating
with respect to v, one obtains g(u, v) = 0, qed. [Note that Chap. III, n° 23,
only requires D_1 f and D_2 f to be differentiable at the point (x, y) considered].


13 — Extension of the fundamental theorem to regulated functions 


Let us return to the usual integral and to the “fundamental theorem”. The
hypothesis that f is of class C¹ is not indispensable in justifying formula
(12.9); it still holds if, for example, f′(x) is a regulated function. For this, and
as N. Bourbaki and also Dieudonné (Eléments d’analyse, Vol. 1, Chap. VIII,
n° 7) do, let us say that for a regulated function f defined on an arbitrary
interval I any continuous function F which has a derivative equal to f(x)
outside a countable subset D of I is a primitive of this regulated function.


Theorem 12 bis. Let f be a regulated function on an interval I ⊂ R. Then
(i) f has a primitive F in I; (ii) any two primitives of f are equal up to an
additive constant; (iii) we have

    \int_a^b f(x)\,dx = F(b) - F(a)

for any a, b ∈ I; (iv) every primitive F of f has right and left derivatives
F′_r(x) = f(x+), F′_l(x) = f(x−) at every point.

Point (i) follows from Theorem 11: the continuous function F of Theorem 11
satisfies (iv) and since f is continuous outside a countable set D, F
has a derivative F′(x) = f(x) for x ∉ D.

Since (iii) is valid for a particular primitive, (iii) will be established if we
prove (ii), in other words that a continuous function having a zero derivative
outside a countable set D is constant. For this it is enough to show that if
the derivative is ≥ 0 outside D, then the function is increasing, for a function
which is simultaneously increasing and decreasing has no choice.

We shall give two rather different and extremely ingenious proofs of this
theorem²⁷.

It is enough to establish this result when f′(x) is strictly positive everywhere
on I − D, since if f(x) + εx, which satisfies this hypothesis, is increasing
for every ε > 0, then so clearly is f in the limit. One may also restrict oneself
to examining the case where I is compact, since, to show that a < b implies
f(a) ≤ f(b), it is enough to argue on the interval [a, b].

The basic idea is that the ratio [f(x + h) − f(x)]/h is > 0 for h small enough
since it tends to f′(x) > 0; then f(x) < f(x + h) for every small enough h > 0;
it remains to pass from “locally” increasing to “globally” increasing, which the
Founding Fathers considered obvious.


First proof. Assume that $f'(x) > 0$ for $x \in I - D$. If $f$ is not increasing,
there exist points $c$, $d$ of $I$ such that $c < d$, $f(c) > f(d)$. For every number
$\xi \in\, ]f(d), f(c)[$, the equation $f(x) = \xi$ has at least one solution between $c$
and $d$ (intermediate value theorem). The set $E(\xi)$ of these solutions is closed
since $f$ is continuous, and it is bounded since $E(\xi) \subset [c,d]$; it therefore contains
the number $\sup E(\xi) = d_\xi$, and we have $d_\xi < d$ since $f(d_\xi) = \xi > f(d)$.

Let us show that if $h > 0$ is small enough that $d_\xi + h < d$ then
$f(d_\xi + h) < f(d_\xi) = \xi$. If in fact $f(d_\xi + h) \ge \xi$, a number $> f(d)$, then the
equation $f(x) = \xi$ would have a solution between $d_\xi + h$ and $d$; absurd since
$\sup E(\xi) = d_\xi < d_\xi + h$.

It follows from this that, for every $\xi \in\, ]f(d), f(c)[$ and every sufficiently
small $h > 0$,

$$[f(d_\xi + h) - f(d_\xi)]/h < 0.$$

This is impossible if the derivative $f'(d_\xi)$ (or even only the right derivative)
exists, since by hypothesis it is $> 0$; the function $f$ is thus not differentiable
at any of the points $d_\xi$, which proves that $d_\xi \in D$.

Now the map $\xi \mapsto d_\xi$ of $]f(d), f(c)[$ into $D$ is injective because

$$\xi \ne \eta \Longrightarrow f(d_\xi) = \xi \ne \eta = f(d_\eta) \Longrightarrow d_\xi \ne d_\eta.$$

²⁷ I find the first in Walter, Analysis I, pp. 354-359, who attributes it to L. Scheeffer,
1885, the date when arguments à la Cantor began to be fashionable. For
the second I have simplified the method of Dieudonné, Eléments ..., Vol. 1,
Chap. VIII, n° 6, which treats the case of functions with values in Banach
spaces, and could in fact be deduced from the result for real-valued functions
with the help of the Hahn-Banach theorem. Both authors prove the mean value
theorem, below, directly.


If the theorem were false we would have constructed an injective map of an
interval $]f(d), f(c)[$ of $\mathbf{R}$ into a countable set, absurd if one believes Cantor ...

As we have seen in passing, it would suffice, to obtain (2), to assume that 
f has a positive right derivative outside D. The same remark applies to the 
proof that follows. 


Second proof. Here again we reduce to showing that $f(a) \le f(b)$ if
$f'(x) > 0$ on $I = [a, b]$.

Let us first explain the method in the simplest case (already explained
in another way in Chap. III, n° 16) where $f$ is differentiable for every $x \in I$
without exception. The set $E$ of the $x \in [a,b]$ such that $f(a) \le f(x)$ contains
$a$ and is closed since $f$ is continuous. Let $c = \sup(E) \le b$, whence $c \in E$;
assume that $c < b$. For all $x > c$ near enough to $c$, one has $f(x) > f(c) \ge f(a)$
since $f'(c) > 0$, whence $x \in E$ if $x \in I$. If $c < b$ then $E$ contains points
$x > c = \sup(E)$, absurd. Whence $c = b$, qed.

Now let us pass to the general case, where the derivative exists only
outside $D$. First notice that if $f(u) \ge f(a)$ at a point $u \in [a, b]$, then again
$f(x) > f(u)$ [and so $> f(a)$] for every $x > u$ close enough to $u$ if $f'(u)$ exists;
in the contrary case, one may only guarantee, for every $\varepsilon > 0$, that one has
$f(x) > f(u) - \varepsilon$ [and so $> f(a) - \varepsilon$] for every $x > u$ near enough to $u$ since $f$
is continuous. This indicates that if one moves from $a$ to $b$, one is forced, in
seeking to prove the inequality $f(x) \ge f(a)$, to subtract an error term from
$f(a)$ each time that one meets a point of $D$. If one can contrive that the sum
of these errors will be $\le r$ when one arrives at the point $b$, one obtains the
inequality $f(b) \ge f(a) - r$, which, valid for every $r > 0$, proves the theorem.

One must therefore, for each $\xi \in D$, allow an error $r(\xi)$ chosen so that
$\sum r(\xi) \le r$. Since $D$ is finite or countable, one need only write the points of
$D$ in the form of a sequence $(\xi_n)$ and choose, for example, $r(\xi_n) = r/2^n$. We
will find this technique again in the Lebesgue theory.
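The choice $r(\xi_n) = r/2^n$ can be checked on a finite stretch of the sequence. The following is only a sketch: the budget `r = 1/10` and the number of points are arbitrary illustrative values, and exact rationals are used so that no rounding enters.

```python
from fractions import Fraction

def error_allowances(r, count):
    """Return the allowances r/2**n for n = 1, ..., count as exact rationals."""
    r = Fraction(r)
    return [r / 2**n for n in range(1, count + 1)]

# Hypothetical budget r = 1/10 spread over the first 40 points of D.
budget = Fraction(1, 10)
allowances = error_allowances(budget, 40)
total = sum(allowances)

# Every allowance is strictly positive, yet the partial sums stay below r.
print(total < budget)
```

Whatever the number of points, the partial sum equals $r(1 - 2^{-n})$, so the total error never exceeds $r$.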

Let us now pass on to the formal proof. We denote by $E$ the set of $x \in [a, b]$
satisfying the relation

(13.1) $$f(x) \ge f(a) - \sum_{\xi < x} r(\xi) = g(x);$$

the series converges since it is a subseries of a convergent series of positive
terms. All we have to prove is that $b \in E$.

The first claim to make is that the function $g$ defined by the right hand
side of (1) is decreasing: for $y < x$, the $\xi < y$ are themselves $< x$; consequently,
$g(x)$ is obtained by subtracting from $g(y)$ the numbers $r(\xi) > 0$ for the $\xi \in D$
such that $x > \xi \ge y$. This argument even shows that

(13.2) $$x > y \Longrightarrow g(x) \le g(y) - r(y),$$

agreeing to put $r(y) = 0$ if $y \notin D$.

Now, as above, let $c = \sup(E)$; we show first that $c \in E$. For every $x \in E$
we have $x \le c$ and so $g(x) \ge g(c)$, whence $f(x) \ge g(x) \ge g(c)$; since $c$ is the
limit of points of $E$, we also have $f(c) \ge g(c)$ since $f$ is continuous.

It remains to prove that $c = b$. Now assume that $c < b$, and let us consider
the $x \in\, ]c, b]$. There are two possible cases.

(i) $f$ is differentiable at $c$. If $x$ is near enough to $c$, we have on the one
hand that $f(x) > f(c)$ since $f'(c) > 0$, and on the other hand $g(x) \le g(c)$
since $x > c$; whence, for these $x$, $f(x) > f(c) \ge g(c) \ge g(x)$ and so $x \in E$;
absurd since $\sup(E) = c < x$.

(ii) $f$ is not differentiable at $c$, i.e. $c \in D$. For $c < x \le b$, one has
$f(x) < g(x)$ because $x \notin E$. Since $c < x$, we have $g(x) \le g(c) - r(c)$ by (2).
Thus $f(x) < g(c) - r(c)$. But $g(c) \le f(c)$ since $c \in E$, see (1). Consequently

$$c < x \le b \Longrightarrow f(x) < f(c) - r(c);$$

since $r(c) > 0$ because $c \in D$, this relation contradicts the continuity of $f$ at
the point $c$ if $c < b$, qed.

As we have said, most authors deduce the preceding results from a state- 
ment which appears more general and which we have already met for every- 
where differentiable functions: 


Mean value theorem. Let $f$ be a scalar function defined and continuous
on an interval $I \subset \mathbf{R}$; assume that $f$ is differentiable at every point of $I - D$,
where $D$ is a countable subset of $I$. Then, for any $a, b \in I$ with $a < b$,

(13.3) $$|f(b) - f(a)| \le M(b - a)$$

where $M \le +\infty$ is the upper bound of $|f'(x)|$ on the interval $[a, b]$.


First assume that the function $f$ is real and that, for $x \in [a, b]$, $f'(x)$ has
its values in a compact interval $[m, M]$ (there is nothing to prove if $f'$ is not
bounded in $[a, b]$); we show that then

(13.4) $$m(y - x) \le f(y) - f(x) \le M(y - x)$$

for $a \le x \le y \le b$. Now this means that the function $f(x) - mx$ is increasing
and the function $f(x) - Mx$ is decreasing in $[a,b]$. The derivatives of these
functions being respectively positive and negative outside $D$, we can write
the final qed in the real case.

The case of a function $f$ with complex values reduces to the preceding
case by an artifice already employed in the same context [Chap. III, n° 16,
proof of (16.5)]: one applies the result already established for real functions
to the function $f_z(x) = \mathrm{Re}[\bar z f(x)]$, where $z$ is an arbitrary complex constant;
its derivative $\mathrm{Re}[\bar z f'(x)]$ lies, outside $D$, between $-M|z|$ and $M|z|$ where
$M = \sup|f'(x)|$, so that

$$|\mathrm{Re}[\bar z f(b)] - \mathrm{Re}[\bar z f(a)]| \le M|z|(b - a),$$

i.e.

$$|\mathrm{Re}\,\bar z[f(b) - f(a)]| \le M|z|(b - a);$$

it remains to choose $z = f(b) - f(a)$, so that the left hand side becomes
$|f(b) - f(a)|^2$, and to divide by $|f(b) - f(a)|$.
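A quick numerical illustration of (13.3) for a complex-valued function may help. The sketch below takes $f(x) = e^{ix}$ as a sample function (our choice, not the text's), for which $|f'(x)| = 1$ everywhere, so that $M = 1$ on any interval.

```python
import cmath
import math

def f(x):
    """Sample complex-valued function; |f'(x)| = 1 for every real x."""
    return cmath.exp(1j * x)

def mean_value_slack(a, b, M=1.0):
    """Return M*(b - a) - |f(b) - f(a)|, which (13.3) predicts to be >= 0."""
    return M * (b - a) - abs(f(b) - f(a))

# Arbitrary sample intervals [a, b] with a < b.
intervals = [(0.0, 0.5), (0.0, math.pi), (0.0, 2 * math.pi), (-1.0, 3.0)]
slacks = [mean_value_slack(a, b) for a, b in intervals]
print(all(s >= -1e-12 for s in slacks))
```

On $[0, 2\pi]$ one has $f(b) - f(a) = 0$ although $f'$ never vanishes: for complex values only the inequality survives, not an equality $f(b) - f(a) = f'(c)(b - a)$.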


Corollary. Let $f$ be a function with complex values defined and continuous
on an interval $I \subset \mathbf{R}$; assume $f$ differentiable at every point of $I - D$, where
$D$ is a countable subset of $I$. If $f'(x) = 0$ for every $x \in I - D$, then $f$ is
constant in $I$.


Note that this corollary does not assume that f’ is a regulated function 
even though in practice ... 


To establish the existence of a primitive, we have employed Theorem 11, 
i.e. integration theory. We are going to expound the Dieudonné method (or 
Bourbaki, Functions of a real variable) to resolve these problems without 
using this, a most instructive exercise²⁸. The idea of the proof is to establish
the result for step functions, which is easy, then to pass to uniform limits. 
First we assume I = [a,b] compact, the general case being deduced easily as 
will be seen. 


Fig. 7. 


It is first of all clear that every step function $\varphi$ admits a primitive: using
a subdivision of $I$ adapted to $\varphi$, one considers, on each interval $[x_k, x_{k+1}]$, a
linear function $\Phi$, of slope equal to the value of $\varphi$ on $]x_k, x_{k+1}[$, whose value
at $x_k$ is equal to that of the primitive already constructed on the interval
$[a, x_k]$, to ensure the continuity of the function $\Phi$ constructed piecewise in
this way. $\Phi$ is a piecewise linear continuous function and it is not difficult
to check by banal calculations that

(13.5) $$\Phi(v) - \Phi(u) = \int_u^v \varphi(x)\,dx$$

for any $u, v \in I$. Even if one ignores, in the French or English sense, the mean
value theorem, which would allow one to show that the preceding construction
provides all the primitives²⁹ of $\varphi$, it provides, for every step function $\varphi$, at
least one standard primitive defined up to an additive constant. This is the
one we shall use in the course of the proof, and for good reason, since there
is nothing else!

²⁸ The rest of this n° is more a “bonus for the reader” than an essential element of
the theory.

With this convention, by (5)

(13.6) $$|\Phi(u) - \Phi(v)| \le \|\varphi\|_I \cdot |u - v|$$

for any $u, v \in I$.
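The piecewise linear construction and the bound (13.6) can be made concrete. What follows is a minimal sketch under stated assumptions: the subdivision `breaks` and the constants `values` are hypothetical data, and the values of the step function at the subdivision points themselves are ignored (they are assumed not to exceed the bound used).

```python
def step_primitive(breaks, values):
    """Primitive Phi, with Phi(breaks[0]) = 0, of the step function equal to
    values[k] on the open interval ]breaks[k], breaks[k+1][."""
    # Values of Phi at the subdivision points, accumulated interval by interval.
    knots = [0.0]
    for k, v in enumerate(values):
        knots.append(knots[-1] + v * (breaks[k + 1] - breaks[k]))

    def phi_primitive(x):
        for k, v in enumerate(values):
            if breaks[k] <= x <= breaks[k + 1]:
                # Linear piece of slope v, glued continuously at breaks[k].
                return knots[k] + v * (x - breaks[k])
        raise ValueError("x lies outside the interval")

    return phi_primitive

# Hypothetical step function on [0, 4]: 2 on ]0,1[, -1 on ]1,2[, 0.5 on ]2,4[.
breaks = [0.0, 1.0, 2.0, 4.0]
values = [2.0, -1.0, 0.5]
Phi = step_primitive(breaks, values)

# Check the Lipschitz bound (13.6) on a grid of sample points.
sup_norm = max(abs(v) for v in values)
grid = [0.25 * i for i in range(17)]
lipschitz_ok = all(
    abs(Phi(u) - Phi(v)) <= sup_norm * abs(u - v) + 1e-12
    for u in grid for v in grid
)
print(lipschitz_ok)
```

The only design decision is the normalisation `Phi(breaks[0]) = 0`, which fixes the additive constant.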

To pass from this to the case of an arbitrary regulated function $f$ one
chooses a sequence of step functions $\varphi_n$ converging to $f$ uniformly on $I = [a, b]$
and, for each $n$, a primitive $\Phi_n$ of $\varphi_n$; since we obviously want the $\Phi_n$ to
converge to a primitive of $f$, it is prudent to impose the condition $\Phi_n(a) = 0$.
We shall show that the functions $\Phi_n$ converge uniformly on $I$ to a primitive
$F$ of $f$, which will prove its existence.

Now the piecewise linear function $\Phi_{pq} = \Phi_p - \Phi_q$ is a primitive of the step
function $\varphi_{pq} = \varphi_p - \varphi_q$. By (6) for $v = a$ we have

(13.7) $$|\Phi_{pq}(u)| \le \|\varphi_{pq}\|_I \cdot (u - a) \le m(I)\,\|\varphi_{pq}\|_I$$

for every $u \in I$, whence $\|\Phi_p - \Phi_q\|_I \le m(I)\|\varphi_p - \varphi_q\|_I$, which shows that
the $\Phi_n$ satisfy Cauchy’s criterion. Hence their uniform convergence to a limit
function $F$, continuous like the $\Phi_n$.

We must now prove that $F'(x) = f(x)$ outside a countable set of points
of $I$. Of course there is Theorem 19 of Chap. III, n° 17, but that assumes
the $\Phi_n$ to be differentiable everywhere, which is not the case here except
outside a finite set $D_n \subset I$ for each $n$. Outside the union $D$ of the $D_n$ one
may imitate the proof of the theorem in question, Theorem 19, the essential
being, in the present notation, to show that at every point $t \notin D$ one has a
relation analogous to formula (17.4) of Chap. III:

(13.8) $$\lim_{x \to t} \lim_{n \to \infty} \frac{\Phi_n(x) - \Phi_n(t)}{x - t} = \lim_{n \to \infty} \lim_{x \to t} \frac{\Phi_n(x) - \Phi_n(t)}{x - t};$$

on the left hand side the limit over $n$ is $[F(x) - F(t)]/(x - t)$, so that the limit
over $x$, if it exists, must be $F'(t)$; on the right hand side the limit over $x$ is
the derivative in $t$ of $\Phi_n$, i.e. $\varphi_n(t)$, which exists since we are outside $D$ and
a fortiori outside $D_n$, so that the limit over $n$ is $f(t)$; whence $F'(t) = f(t)$
modulo proving (8).

²⁹ Note that while it is easy to construct a primitive of a step function “without
knowing anything”, to show that it is unique up to a constant reduces, even in
this particularly elementary case, to proving that a function with everywhere
zero derivative is constant.

We have to argue as in Chap. III, i.e. to apply Theorem 16 of n° 12 on
the “limits of limits”. Again we put (for a given $t$)

$$c_n = \Phi_n'(t), \qquad u_n(x) = [\Phi_n(x) - \Phi_n(t)]/(x - t), \qquad u(x) = [F(x) - F(t)]/(x - t)$$


and work on the set $X = I - (D \cup \{t\})$ obtained by omitting from $I$ on the
one hand the points of $D$ where the $\Phi_n$ are not all differentiable, on the other
hand the point $t$ where the quotients (8) are meaningless³⁰. We know that
$u_n(x)$ tends to $c_n$ when $x \to t$, and that $u_n(x)$ tends to $u(x)$ for every $x \in X$
when $n \to +\infty$. To deduce that the $c_n$ tend to a limit $c$ and that $u(x)$ tends
to $c$ when $x \to t$, it is enough to show that the $u_n$ converge to $u$ uniformly
on $X$ and, for this, to verify the corresponding Cauchy criterion. But again
putting $\Phi_{pq} = \Phi_p - \Phi_q$, $\varphi_{pq} = \varphi_p - \varphi_q$, the general relation (6) shows that

$$|\Phi_{pq}(x) - \Phi_{pq}(t)| \le \|\varphi_{pq}\|_I \cdot |x - t| = \|\varphi_p - \varphi_q\|_I \cdot |x - t|.$$

Since clearly

$$u_p(x) - u_q(x) = [\Phi_{pq}(x) - \Phi_{pq}(t)]/(x - t),$$

it follows that

$$|u_p(x) - u_q(x)| \le \|\varphi_p - \varphi_q\|_I \quad \text{for every } x \in X,$$

and thus $\|u_p - u_q\|_X \le \|\varphi_p - \varphi_q\|_I$. Hence the uniform convergence of the
$u_n$ on $X$. The relation (8) is thus justified at every point where the limits
appearing on the right hand side of (8) exist, i.e. on $I - D$.

Since, what is more, the relation (5) is valid for every $\varphi_n$, it is clear that
it is also valid for $f$ and $F$ by passage to the uniform limit. This finishes the
proof in the case of a compact interval $I$.

The case of a noncompact interval $I$ reduces to this immediately. One
chooses a point $a \in I$ and writes $I = \bigcup I_n$ where the $I_n$ are compact intervals
containing $a$ and such that $I_n \subset I_{n+1}$. On each $I_n$ the function $f$ has a
primitive $F_n$ such that $F_n(a) = 0$, unique since $F' = 0$ implies $F = \text{const}$
even if $F'(x)$ does not exist on the points of a countable set. Thus $F_n = F_{n+1}$
on $I_n$ for any $n$, whence the existence on $I$ of a function $F$ which, on each
$I_n$, coincides with $F_n$. It is clear that $F$ satisfies Theorem 9 bis, etc.


14 — Convex functions; Hölder and Minkowski inequalities


Let $a$ and $b$ be two distinct points in a Cartesian space $\mathbf{R}^p$. A point $x \in \mathbf{R}^p$
lies on the line joining $a$ and $b$ if and only if the vector $x - a$ is proportional
to the vector $b - a$, i.e. $x - a = t(b - a)$ or

(14.1) $$x = (1 - t)a + tb$$

for some $t \in \mathbf{R}$. Since the points $a$ and $b$ correspond to the values 0 and 1
of $t$, we conclude that the points of the line segment $[a, b]$ joining $a$ to $b$ are
obtained for $t \in [0,1]$. One might even consider this statement as a definition
of $[a, b]$.

³⁰ Note in passing the usefulness of defining uniform convergence for functions
defined on any set, and not only on an interval of R. Nor is it any more difficult

We have said (Chap. III, n° 10, Example 1) that a subset $X$ of $\mathbf{R}^p$ is
convex if

(14.2) $$[a, b] \subset X \quad \text{for any } a, b \in X.$$

In $\mathbf{R}$, the only convex sets are the intervals. In $\mathbf{C}$, the interiors (also if one
includes the boundary) of a circle, of an ellipse, of a triangle, of a rectangle,
etc. are convex. A circular ring is not.

One may generalise (2) and show that³¹

(14.3) $$t_1 x_1 + \ldots + t_n x_n \in X$$

for any $x_i \in X$ and $t_i$ satisfying $t_i \ge 0$, $\sum t_i = 1$; one shows this by induction
on $n$, introducing the point

$$x = (t_1 x_1 + \ldots + t_{n-1} x_{n-1})/(t_1 + \ldots + t_{n-1}),$$

which is in $X$ by hypothesis, and observing that the point (3) is precisely
$tx + (1 - t)x_n$, where $t = t_1 + \ldots + t_{n-1} = 1 - t_n$.


Fig. 8. (Labels on the figure: $x$, $tx + (1 - t)y$, $y$.)


Let $f$ be a real-valued function defined on a convex set $X \subset \mathbf{R}^p$; its graph
is then the set of points of $\mathbf{R}^p \times \mathbf{R} = \mathbf{R}^{p+1}$ of the form $(x, f(x))$, and the subset
of $\mathbf{R}^{p+1}$ situated “above” the graph is the set of $(x, y)$ such that $x \in X$ and
$y \ge f(x)$. One says that $f$ is a convex function if this set is convex. One sees
immediately that this is so if and only if

(14.4) $$f[(1 - t)x + ty] \le (1 - t)f(x) + tf(y)$$

for any $x, y \in X$ and $0 \le t \le 1$. Arguing as we did to establish (3), we deduce
that

(14.5) $$f(t_1 x_1 + \ldots + t_n x_n) \le t_1 f(x_1) + \ldots + t_n f(x_n)$$

for any $x_i \in X$ and $t_i \ge 0$ of sum 1.

³¹ The point (3) is, in mechanics, the “centre of gravity” of the “masses” $t_i \ge 0$
placed at the points $x_i$. When their sum is not equal to 1 one clearly has to
divide the result by the total mass. Exercise: show that the medians of a triangle
are concurrent.
In the case where X is an open interval of R one may characterise the 
convex functions completely by differentiability properties: 


Theorem 15. The real-valued function f defined on an open interval X of 
R is convex if and only if it is a primitive of an increasing function. The 
function f is continuous³², has right and left derivatives everywhere, and is
differentiable outside a countable subset of X. 


Fig. 9.


Consider Figure 9. When $B$ tends to $M$, the slope of $MB$, which decreases
while remaining above that of $AM$, tends to a limit, whence the existence
at every $x \in X$ of a right derivative $f'_d(x)$ and, likewise, of a left derivative
$f'_g(x) \le f'_d(x)$; whence also the continuity of $f$. The slopes of the lines $AM$,
$MB$, $MC$, $BC$, $CN$ and $ND$ increase since the point $M$, for example, lies
below the segment $[A, B]$. Since on the other hand the slope of $MB$ is less
than that of $CN$, itself less than its limit $f'_g(y)$ when $C$ tends to $N$, so

(14.6) $$f'_g(x) \le f'_d(x) \le f'_g(y) \le f'_d(y) \quad \text{for } x < y;$$

the functions $f'_g$ and $f'_d$ are therefore increasing³³. On letting $y$ tend to $x$ in
(6) one finds

(14.7) $$f'_g(x) \le f'_d(x) \le f'_g(x+) \le f'_d(x+);$$

on letting $x$ tend to $y$, one finds likewise

$$f'_g(y-) \le f'_d(y-) \le f'_g(y) \le f'_d(y),$$

which, applied to $x$, allows us to complete (7) as

$$f'_g(x-) \le f'_d(x-) \le f'_g(x) \le f'_d(x) \le f'_g(x+) \le f'_d(x+).$$

³² This is not necessarily the case if $X$ is a non-open interval. On $X = [0,1]$ the
function equal to 0 for $0 < x < 1$, and to 1 at the end-points, is convex.
³³ For functions of class $C^1$ the reader may pass directly to (9).


Now we have $f'_d(y) \le f'_g(z)$ for $y < z$ by (6); on letting $y$ and $z$ tend to $x+$
one finds $f'_d(x+) \le f'_g(x+)$, whence

$$f'_g(x+) = f'_d(x+) \quad \text{and likewise} \quad f'_g(x-) = f'_d(x-)$$

for every $x$; finally

(14.8) $$f'_g(x-) = f'_d(x-) \le f'_g(x) \le f'_d(x) \le f'_g(x+) = f'_d(x+).$$

The functions $f'_g$ and $f'_d$ being increasing, and so regulated, can be
discontinuous only on a countable set $D$ of points of $X$ (Chap. III, n° 12, Corollary
to Theorem 16 or n° 7, Theorem 6 of the present Chap.); (8) shows moreover
that the points where they are discontinuous are the same for the two
functions. Outside $D$, the six terms of (8) are equal, and $f$ is differentiable
since its right and left derivatives are equal.

The function $f$ being continuous, and admitting, outside $D$, a derivative
equal, at one’s choice, to the regulated function $f'_g$ or to $f'_d$, it must be a
primitive of $f'_g$ and of $f'_d$ in the sense of n° 13; in other words,

$$f(y) - f(x) = \int_x^y f'_g(u)\,du = \int_x^y f'_d(u)\,du$$

for any $x, y \in X$.


Conversely, let us start from an increasing, so regulated, function $g$ on $X$
and let $f$ be a primitive of $g$. Consider two points $x$, $y$ of $X$ and suppose for
example $x < y$. For $t \in [0,1]$ one has

(14.9) $$\begin{aligned} f[(1 - t)x + ty] - \{(1 - t)f(x) + tf(y)\} &= \{f[x + t(y - x)] - f(x)\} - t\{f(y) - f(x)\} \\ &= t\int_0^{y-x} g(x + tv)\,dv - t\int_0^{y-x} g(x + v)\,dv \end{aligned}$$

as one sees on differentiating the functions $f(x + tv)$ and $f(x + v)$ with respect
to $v$ and applying the FT. But since $g$ is increasing, and since $v \ge 0$ on the
interval of integration, we have

$$g(x + tv) \le g(x + v) \quad \text{for } t \in [0,1].$$

The difference between the two last integrals is thus $\le 0$ so the function $f$ is
convex, qed.
Corollary 1. A differentiable function $f$ is convex on an open interval if and
only if its derivative is increasing. A twice differentiable function is convex
if and only if $f''(x) \ge 0$ for every $x$.


For if $f'(x)$ exists everywhere, and is increasing, so regulated, then $f$ is a
primitive of $f'$ and is therefore convex. If $f'$ is differentiable it is increasing
if and only if $f''(x) \ge 0$ everywhere, by the mean value theorem (Chap. III,
n° 16).


The case (a bonus for the reader) of a function $f$ defined on a convex
open subset $X$ of $\mathbf{R}^p$, for example of $\mathbf{C}$, reduces to the preceding if
one assumes $f$ of class $C^2$ on $X$. In the first place, the convexity of $f$
obviously means that, for any $x, y \in X$, the function $t \mapsto f[(1 - t)x + ty]$,
defined on a convex open subset (i.e. an open interval) of $\mathbf{R}$, is
convex. Since $f$ is of class $C^1$, this function has a derivative equal to

$$\frac{d}{dt} f[(1 - t)x_1 + ty_1, \ldots, (1 - t)x_p + ty_p] = \sum_i D_i f[(1 - t)x + ty](y_i - x_i)$$

by the chain rule of Chap. III, n° 21. The result obtained is again
differentiable if $f$ is of class $C^2$, with, for the same reason,

(14.10) $$\frac{d^2}{dt^2} f[(1 - t)x + ty] = \sum_{i,j} D_j D_i f[(1 - t)x + ty](y_i - x_i)(y_j - x_j).$$

By Corollary 1, this result must be $\ge 0$; in particular, we must have
(take $t = 0$)

(14.11) $$\sum_{i,j} D_j D_i f(x)\,u_i u_j \ge 0$$

whenever the $u_i$ can be put in the form $y_i - x_i$ for some $y \in X$.
But since $X$ is open these differences can, for $x$ given, take all values
sufficiently close to 0, so that, for arbitrary $u_i \in \mathbf{R}$, (11) must be
satisfied by the $tu_i$ for every $t \in \mathbf{R}$ sufficiently close to 0; since this
substitution multiplies (11) by $t^2 \ge 0$, we conclude that (11) must
be satisfied for any $u_i \in \mathbf{R}$: the quadratic form (11) must be positive
semidefinite, as one says in algebra. Conversely, it is clear that if (11)
is satisfied for any $x \in X$ and $u_i \in \mathbf{R}$, then the derivative (10) is $\ge 0$
so long as $(1 - t)x + ty \in X$; the function

$$t \mapsto f[(1 - t)x + ty]$$

is thus convex, so $f$ is too. Consequently:


Corollary 2. Let $f$ be a real-valued function defined and of class $C^2$
in a convex open set $X \subset \mathbf{R}^p$. Then $f$ is convex on $X$ if and only if

$$\sum_{i,j} D_i D_j f(x)\,u_i u_j \ge 0$$

for any $u_i \in \mathbf{R}$ and $x \in X$.
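For $p = 2$ the criterion of Corollary 2 is easy to test by hand. The sketch below uses two sample functions of our own choosing, $f(x, y) = x^2 + xy + y^2$ and $g(x, y) = x^2 - y^2$, with their second derivatives written out explicitly; it relies on the standard fact that a symmetric form $a u_1^2 + 2b u_1 u_2 + c u_2^2$ is positive semidefinite exactly when $a \ge 0$, $c \ge 0$ and $ac - b^2 \ge 0$.

```python
def is_positive_semidefinite_2x2(a, b, c):
    """True when a*u1^2 + 2*b*u1*u2 + c*u2^2 >= 0 for all real (u1, u2)."""
    return a >= 0 and c >= 0 and a * c - b * b >= 0

def f(x, y):
    return x * x + x * y + y * y

def g(x, y):
    return x * x - y * y

def hessian_f(x, y):
    # (D1D1 f, D1D2 f, D2D2 f) for f(x, y) = x^2 + x*y + y^2
    return 2.0, 1.0, 2.0

def hessian_g(x, y):
    # (D1D1 g, D1D2 g, D2D2 g) for g(x, y) = x^2 - y^2
    return 2.0, 0.0, -2.0

def midpoint_convex(func, p, q):
    """Check inequality (14.4) with t = 1/2 between the points p and q."""
    mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
    return func(*mid) <= (func(*p) + func(*q)) / 2 + 1e-12

print(is_positive_semidefinite_2x2(*hessian_f(0.0, 0.0)))
print(is_positive_semidefinite_2x2(*hessian_g(0.0, 0.0)))
```

The midpoint test is only a spot check of (14.4) at $t = 1/2$; it can refute convexity (as it does for $g$ between $(0, -1)$ and $(0, 1)$) but cannot by itself prove it.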




Theorem 15 allows us to establish the famous inequalities of Hölder and
of Minkowski, which are not very useful at our level, but general culture ...
The function $e^x$ has an everywhere positive second derivative and so is
convex on $\mathbf{R}$, whence

$$e^{tx + (1 - t)y} \le te^x + (1 - t)e^y$$

for any $x, y \in \mathbf{R}$ and $0 \le t \le 1$. We can also write this in the form

(14.12) $$a^t b^{1-t} \le ta + (1 - t)b$$

for $a, b > 0$ and even for $0 \le a, b \le +\infty$ on agreeing that $0^t = 0$, $(+\infty)^t = +\infty$,
$t\cdot(+\infty) = +\infty$ for $t > 0$ and that $0\cdot(+\infty) = 0$.
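Inequality (14.12) is easily tested on a few finite positive values. This is merely a numerical sketch; the sample values of $a$, $b$, $t$ are arbitrary, and the conventions for $0$ and $+\infty$ are not covered.

```python
def young_slack(a, b, t):
    """Return t*a + (1 - t)*b - a**t * b**(1 - t), predicted >= 0 by (14.12)."""
    return t * a + (1 - t) * b - a ** t * b ** (1 - t)

# Arbitrary positive samples; t ranges over a few values in ]0, 1[.
samples = [(a, b, t)
           for a in (0.5, 1.0, 3.0, 10.0)
           for b in (0.2, 1.0, 7.0)
           for t in (0.1, 0.5, 0.9)]
worst = min(young_slack(a, b, t) for a, b, t in samples)
print(worst >= -1e-12)
```

The slack vanishes when $a = b$, which is the case of equality in the convexity of the exponential.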

Now consider an arbitrary set $X$ and suppose that to every function $f$
defined on $X$ and with values in $[0, +\infty]$ one has associated a number $\mu^*(f)$
possessing the following properties³⁴:

(IS 1): $0 \le \mu^*(f) \le +\infty$;
(IS 2): the relation $f \le g$ implies $\mu^*(f) \le \mu^*(g)$;
(IS 3): $\mu^*(f + g) \le \mu^*(f) + \mu^*(g)$ for any $f$, $g$;
(IS 4): $\mu^*(cf) = c\mu^*(f)$ for every constant $c > 0$.


If $F$ and $G$ are functions on $X$ with values in $[0, +\infty]$ one has

$$F^t G^{1-t} \le tF + (1 - t)G$$

by (12); whence, using (IS 2), (IS 3) and (IS 4),

$$\mu^*(F^t G^{1-t}) \le \mu^*[tF + (1 - t)G] \le \mu^*(tF) + \mu^*[(1 - t)G] = t\mu^*(F) + (1 - t)\mu^*(G)$$

for $0 < t < 1$. In particular we see that

$$\mu^*(F) = \mu^*(G) = 1 \Longrightarrow \mu^*(F^t G^{1-t}) \le 1.$$

If one assumes only that $\mu^*(F)$ and $\mu^*(G)$ are finite and nonzero, and if
one applies the last result to the functions $F/\mu^*(F)$ and $G/\mu^*(G)$, which
is legitimate by (IS 4), the function $F^t G^{1-t}$ is divided by the constant
$\mu^*(F)^t \mu^*(G)^{1-t}$, whence, using (IS 4) again,

(14.13) $$\mu^*(F^t G^{1-t}) \le \mu^*(F)^t \mu^*(G)^{1-t}.$$


Now put

(14.14) $$N_p(f) = \mu^*(|f|^p)^{1/p} \le +\infty$$

for every function $f$ with complex values on $X$ and every real number $p > 0$.
If $f$ and $g$ are two such functions and if $p$ and $q$ are real numbers $> 0$, let us
choose $F = |f|^p$, $G = |g|^q$ and $t = r/p$, $1 - t = r/q$ where $r > 0$ is such that
$r/p + r/q = 1$, i.e. such that

(14.15) $$1/p + 1/q = 1/r.$$

Then $F^t G^{1-t} = |f|^{pr/p}|g|^{qr/q} = |fg|^r$, so that (13) becomes

$$\mu^*(|fg|^r) \le \mu^*(|f|^p)^{r/p}\,\mu^*(|g|^q)^{r/q},$$

whence, on raising both sides to the power $1/r$,

(14.16) $$N_r(fg) \le N_p(f)N_q(g) \quad \text{for } 1/p + 1/q = 1/r.$$

³⁴ These conditions generalise the properties of the upper integral established in
n° 11 and are met in the general theory of integration. It is not really necessary
to assume that $\mu^*(f)$ is defined for every positive function on $X$; it is enough
that the formulae that we are going to write should be meaningful.


Note that $N_r(fg)$ is finite if $N_p(f)$ and $N_q(g)$ are finite because (12), applied
to $F = |f|^p$, $G = |g|^q$, $t = r/p$, $1 - t = r/q$, shows that

(14.17) $$|fg|^r/r \le |f|^p/p + |g|^q/q,$$

so that it remains to apply the axioms (IS).
(16) is the inequality of (Otto) Hölder, most often used only for $r = 1$:

(14.18) $$\mu^*(|fg|) \le \mu^*(|f|^p)^{1/p}\,\mu^*(|g|^q)^{1/q} = N_p(f)N_q(g);$$

this inequality assumes that $1/p + 1/q = 1$ and thus $p, q > 1$ (conjugate
indices). The case where $p = q = 2$ is just the Cauchy-Schwarz inequality
extended to the functions $\mu^*$.

From (18) one may deduce the inequality

(14.19) $$N_p(f + g) \le N_p(f) + N_p(g) \quad \text{for } p > 1$$

of (Hermann) Minkowski, one of Einstein’s professors at the Polytechnic of
Zurich, and, at Göttingen in 1907-1908, the inventor of the interpretation
of Relativity in $\mathbf{R}^4$ endowed with the quadratic form $x^2 + y^2 + z^2 - c^2t^2$.
He would probably have gone much further if he had not died soon after ...
There is nothing to prove if the right hand side is infinite or if the left hand
side is zero. If the right hand side is finite, so too is the left hand side, for,
the function $x^p$ being convex in $x \ge 0$ for $p > 1$, one has

$$\left(\frac{|f| + |g|}{2}\right)^p \le \frac{1}{2}\left(|f|^p + |g|^p\right).$$

This done, we write

$$|f + g|^p \le (|f| + |g|)\cdot|f + g|^{p-1} = |f|\cdot|f + g|^{p-1} + |g|\cdot|f + g|^{p-1}.$$
Since $|f + g|^{(p-1)q} = |f + g|^p$, we have $N_q(|f + g|^{p-1}) < +\infty$; Hölder’s
inequality (18) then shows that

$$\begin{aligned} \mu^*(|f + g|^p) &\le [N_p(f) + N_p(g)]\,N_q(|f + g|^{p-1}) \\ &= [N_p(f) + N_p(g)]\,\mu^*(|f + g|^{(p-1)q})^{1/q} \\ &= [N_p(f) + N_p(g)]\,\mu^*(|f + g|^p)^{1/q}; \end{aligned}$$

on multiplying both sides by $\mu^*(|f + g|^p)^{-1/q}$ we obtain

$$\mu^*(|f + g|^p)^{1/p} \le N_p(f) + N_p(g),$$

i.e. (19).


It is probable that the reader has not truly understood these proofs; he 
may console himself in knowing that the majority of the professionals are in 
the same state, having only checked the argument step-by-step, registered 
the results, and forgotten the proofs for the rest of their lives (unless needing 
to expound them to students ...). 


Example 1. If $f$ and $g$ are regulated functions on a bounded interval $I$,

(14.20) $$\int |f(x)g(x)|\,dx \le N_p(f)N_q(g), \qquad N_p(f + g) \le N_p(f) + N_p(g)$$

where one puts

$$N_p(f) = \left(\int |f(x)|^p\,dx\right)^{1/p},$$

the $L^p$ norm of the function $f$. You might replace the traditional integral
$m(f)$ by the expression

$$\mu(f) = \int f(x)\mu(x)\,dx$$

where $\mu$ is a given positive integrable function; it clearly satisfies conditions
(IS 1) to (IS 4) above. One thus finds the inequalities (20) where the symbol
$dx$ is replaced everywhere by $\mu(x)dx$. Without doubt we will be made to
observe that this “generalisation” is illusory since it follows from the classical
case on replacing $f(x)$ by $f(x)\mu(x)^{1/p}$ and $g(x)$ by $g(x)\mu(x)^{1/q}$; precisely.
But the objection falls if one defines $\mu(f)$ from an arbitrary Radon measure
(n° 30).


Example 2. Take for $X$ a finite set and $\mu^*(f) = \sum |f(x)|$. One obtains, in
more traditional notation, the original versions of the inequalities:

(14.21) $$\sum |x_n y_n| \le \left(\sum |x_n|^p\right)^{1/p}\left(\sum |y_n|^q\right)^{1/q}$$

(14.22) $$\left(\sum |x_n + y_n|^p\right)^{1/p} \le \left(\sum |x_n|^p\right)^{1/p} + \left(\sum |y_n|^p\right)^{1/p}$$


Example 3. Like the preceding, but with an infinite set $X$ and, again,
$\mu^*(f) = \sum |f(x)| \le +\infty$ for every function $f$ with positive values. If the
series $\sum |x_n|^p$ and $\sum |y_n|^q$ converge then the series $\sum x_n y_n$ converges
absolutely and

(14.23) $$\sum |x_n y_n| \le \left(\sum |x_n|^p\right)^{1/p}\left(\sum |y_n|^q\right)^{1/q}.$$

All this assumes $p, q > 1$ and $1/p + 1/q = 1$. The case $p = q = 2$ is the
Cauchy-Schwarz inequality for series, which may be proved much more easily
by passing to the limit starting from the case of a finite sum.
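For finite sums the Hölder and Minkowski inequalities can be checked directly. The sketch below is only illustrative: the vectors and exponents are arbitrary sample data, and `norm_p` plays the role of $N_p$ for $\mu^*(f) = \sum |f(x)|$ on a finite set.

```python
def norm_p(xs, p):
    """N_p of a finite sequence: (sum |x_n|^p)^(1/p)."""
    return sum(abs(x) ** p for x in xs) ** (1.0 / p)

def holder_slack(xs, ys, p):
    """N_p(x) N_q(y) - sum |x_n y_n| for the conjugate index q of p > 1."""
    q = p / (p - 1.0)
    lhs = sum(abs(x * y) for x, y in zip(xs, ys))
    return norm_p(xs, p) * norm_p(ys, q) - lhs

def minkowski_slack(xs, ys, p):
    """N_p(x) + N_p(y) - N_p(x + y)."""
    sums = [x + y for x, y in zip(xs, ys)]
    return norm_p(xs, p) + norm_p(ys, p) - norm_p(sums, p)

# Arbitrary sample data; complex entries are allowed as well.
xs = [1.0, -2.0, 3.0, 0.5 + 1j]
ys = [0.5, 4.0, -1.0, 2.0 - 3j]
checks = [(holder_slack(xs, ys, p), minkowski_slack(xs, ys, p))
          for p in (1.5, 2.0, 3.0)]
print(all(h >= -1e-12 and m >= -1e-12 for h, m in checks))
```

For $p = q = 2$ and proportional sequences the Hölder slack is zero, which is the equality case of Cauchy-Schwarz.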
§ 4. Integration by parts


15 — Integration by parts 


The rule for differentiating a product shows that if $f$ and $g$ are two functions
of class $C^1$ then $fg$ is a primitive of the function $f'g + fg'$; consequently, in
Leibniz’ style,

$$\int [f'(x)g(x) + f(x)g'(x)]\,dx = f(x)g(x),$$

which can be rewritten in the form

(15.1) $$\int f'(x)g(x)\,dx = f(x)g(x) - \int f(x)g'(x)\,dx;$$

this relation between primitives can be transformed into a relation between
definite integrals:

(15.2) $$\int_a^b f'(x)g(x)\,dx = f(x)g(x)\Big|_a^b - \int_a^b f(x)g'(x)\,dx.$$

This is the formula for integration by parts, which allows us to calculate very
many elementary integrals. It remains valid when $f$ and $g$ are the primitives
of two regulated functions, since then the function $fg$ is continuous and has
as derivative the regulated function $f'g + fg'$ outside a countable set of points,
so is a primitive of $f'g + fg'$ (Theorem 12 bis).


Example 1. Let us calculate the primitive

$$\int \log(x)\,dx = \int \log(x)\cdot 1\cdot dx = \int \log(x)\cdot(x)'\,dx = \log(x)\,x - \int \log'(x)\,x\,dx = x\log x - \int 1\,dx,$$

whence

(15.3) $$\int \log(x)\,dx = x\log x - x.$$

It is not difficult to check that $\log x$ is indeed the derivative of $x\log x - x$.


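Example 2. We have

$$\begin{aligned} \int x^5 e^x\,dx &= \int x^5 (e^x)'\,dx = x^5 e^x - \int (x^5)' e^x\,dx = x^5 e^x - 5\int x^4 e^x\,dx \\ &= x^5 e^x - 5x^4 e^x + 5\cdot4\int x^3 e^x\,dx \\ &= x^5 e^x - 5x^4 e^x + 5\cdot4\,x^3 e^x - 5\cdot4\cdot3\int x^2 e^x\,dx \\ &= x^5 e^x - 5x^4 e^x + 5\cdot4\,x^3 e^x - 5\cdot4\cdot3\,x^2 e^x + 5\cdot4\cdot3\cdot2\int x e^x\,dx \\ &= x^5 e^x - 5x^4 e^x + 5\cdot4\,x^3 e^x - 5\cdot4\cdot3\,x^2 e^x + 5\cdot4\cdot3\cdot2\,x e^x - 5\cdot4\cdot3\cdot2\cdot1\int e^x\,dx \\ &= x^5 e^x - 5x^4 e^x + 5\cdot4\,x^3 e^x - 5\cdot4\cdot3\,x^2 e^x + 5\cdot4\cdot3\cdot2\,x e^x - 5\cdot4\cdot3\cdot2\cdot1\,e^x. \end{aligned}$$

The reader can generalise this to $x^n e^x$ for $n \in \mathbf{N}$.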
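The generalisation to $x^n e^x$ amounts to finding a polynomial $P$ with $P + P' = x^n$, since $(P e^x)' = (P + P')e^x$. A minimal sketch of the resulting recursion follows; the function name is ours.

```python
def exp_primitive_coeffs(n):
    """Coefficients c[k] of x**k in the polynomial P such that
    P(x) * e^x is a primitive of x^n * e^x, i.e. P + P' = x^n."""
    coeffs = [0] * (n + 1)
    c = 1
    for k in range(n, -1, -1):
        coeffs[k] = c
        # Each integration by parts lowers the degree, multiplies by the
        # old exponent and flips the sign.
        c = -c * k
    return coeffs

# n = 5 reproduces the coefficients 1, -5, 5.4, -5.4.3, ... of Example 2,
# listed here from the constant term upwards.
print(exp_primitive_coeffs(5))
```

The identity $P + P' = x^n$ translates into `coeffs[k] + (k + 1) * coeffs[k + 1] == 0` for $k < n$, which is an easy way to validate the output without any integration.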
And for $n < 0$? Let’s try our luck.

$$\begin{aligned} \int x^{-2} e^x\,dx &= \int (-x^{-1})' e^x\,dx = -e^x/x + \int x^{-1} e^x\,dx \\ &= -e^x/x + \int \log'(x)\,e^x\,dx = -e^x/x + e^x\log x - \int e^x \log(x)\,dx \\ &= -e^x/x + e^x\log x - \int e^x\,[x\log(x) - x]'\,dx \\ &= -e^x/x + e^x\log x - e^x[x\log(x) - x] + \int e^x[x\log(x) - x]\,dx, \end{aligned}$$

and you may continue indefinitely without ever managing to eliminate the
function $e^x \log x$. It would certainly be enough to know one primitive, which
the theory of the power series allows us to find theoretically; but in practice
one naturally tries to express the given integral in terms of a simple
combination of “elementary”, i.e. already known, functions. This is impossible for
the functions $x^s e^x$ with $s \notin \mathbf{N}$, or $e^x \log x$, or $\exp(\exp(x))$, etc.

Moral: one does not have to go very far to find oneself face to face with 
elementary functions whose primitives are not elementary. To be able to cal- 
culate a primitive explicitly is almost always a miracle, and even, in teaching 
analysis, a contrived miracle: the calculation put to you as an exercise is fea- 
sible because the author of the exercise knows it in advance, generally from 
having read through the older authors to extract some from the very many 
exercises of the same kind. 

In fact, mathematicians who have sought in vain to calculate a primitive 
or a definite integral of a function which is important to their applications — as 
happens often in mechanics or in physics — always end up changing their tac- 
tics: they give a name once and for all to the mysterious primitive or integral 
in question, and, instead of calculating it, try to establish its useful proper- 
ties (differential equation, series expansions, asymptotic behaviour, integrals 
linked to these functions, etc.); these are the special functions. At the lowest 
level, this is what has always been done for the trigonometric functions: one 
gives them a name, and, instead of calculating them from beautiful simple 
algebraic formulae which do not exist, one derives their properties. This is 
also what the theory of the elliptic functions has allowed mathematicians
to do for the primitives of the functions of the form $P(x)^{1/2}$, where $P$ is a
polynomial of degree 3 or 4.


Example 3. Let us try to calculate the functions

(15.4) $$I_n = \int x^n \cos x\,dx, \qquad J_n = \int x^n \sin x\,dx.$$

We may write

$$I_n = \int x^n \sin' x\,dx = x^n \sin x - n\int x^{n-1}\sin x\,dx$$

and continue. It is more economical to observe that

$$I_n + iJ_n = \int x^n e^{ix}\,dx$$

and to calculate as in Example 2 or, for good measure, to calculate $\int x^n e^{tx}\,dx$
for every nonzero $t \in \mathbf{C}$. We find easily

(15.5) $$\int x^n e^{tx}\,dx = e^{tx}\left[x^n/t - nx^{n-1}/t^2 + n(n-1)x^{n-2}/t^3 - \ldots + (-1)^n n!/t^{n+1}\right],$$

whence, for $t = i$, on separating the real and imaginary parts,

(15.6) $$I_n = x^n \sin x + nx^{n-1}\cos x - n(n-1)x^{n-2}\sin x - n(n-1)(n-2)x^{n-3}\cos x + \ldots,$$

(15.6') $$J_n = -x^n \cos x + nx^{n-1}\sin x + n(n-1)x^{n-2}\cos x - n(n-1)(n-2)x^{n-3}\sin x - \ldots.$$
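Formula (15.5) lends itself to a numerical cross-check. In the sketch below the right hand side of (15.5) is evaluated for $t = i$ and compared with a Simpson approximation of the integral; the quadrature routine, the exponent $n = 3$ and the interval $[0, 2]$ are illustrative choices of ours.

```python
import cmath
import math

def primitive_xn_etx(n, t, x):
    """Right hand side of (15.5): a primitive of x^n e^{tx} for t != 0."""
    total = 0
    coeff = 1  # holds n(n-1)...(n-k+1) at step k
    for k in range(n + 1):
        total += (-1) ** k * coeff * x ** (n - k) / t ** (k + 1)
        coeff *= (n - k)
    return cmath.exp(t * x) * total

def simpson(g, a, b, steps=2000):
    """Composite Simpson rule with an even number of steps."""
    h = (b - a) / steps
    s = g(a) + g(b)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3

n, t, a, b = 3, 1j, 0.0, 2.0
exact = primitive_xn_etx(n, t, b) - primitive_xn_etx(n, t, a)
numeric = simpson(lambda x: x ** n * cmath.exp(t * x), a, b)
print(abs(exact - numeric) < 1e-8)
```

The real and imaginary parts of `exact` are the increments of $I_3$ and $J_3$ over $[0, 2]$, in agreement with (15.6) and (15.6').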


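Example 4. Put $\log^2 x = (\log x)^2$ and calculate

$$\begin{aligned} \int x\log^2 x\,dx &= \int \left(\tfrac{1}{2}x^2\right)' \log^2 x\,dx = \tfrac{1}{2}x^2\log^2 x - \tfrac{1}{2}\int (\log^2 x)'\,x^2\,dx \\ &= \tfrac{1}{2}x^2\log^2 x - \int x\log x\,dx \\ &= \tfrac{1}{2}x^2\log^2 x - \tfrac{1}{2}x^2\log x + \tfrac{1}{2}\int x\,dx \\ &= \tfrac{1}{2}x^2\log^2 x - \tfrac{1}{2}x^2\log x + x^2/4. \end{aligned}$$

One should not forget that an arbitrary constant may be added to the result
in all these formulae.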
16 — The square wave Fourier series


We are now in a position to justify the square wave series [Chap. III, eqn. (2.4)
and n° 11, Example 1]

(16.1) $$s(x) = \cos x - \cos 3x/3 + \cos 5x/5 - \ldots = \pi/4 \quad \text{for } |x| < \pi/2$$

which Fourier first obtained by breathtaking meaningless calculations; its
value for $\pi/2 < |x| < \pi$, namely $-\pi/4$, follows from (1) on replacing $x$ by
$\pi - x$. Fourier himself did not mention this explicitly in his memoir, but one
has to believe that he had some doubts since he later felt the need to justify
his result by patently more reasonable methods. We shall follow him.

We start quite naturally by calculating the partial sum 


(16.2) $s_n(x) = \cos x - \cos 3x/3 + \ldots + (-1)^{n-1}\cos(2n-1)x/(2n-1)$,
not by an impossible direct calculation, but by differentiating it. We have

$s_n'(x) = -\sin x + \sin 3x - \ldots + (-1)^n\sin(2n-1)x$,

whence, putting $q = e^{ix}$ and using Euler’s formulae,

$2i\,s_n'(x) = -(q - q^{-1}) + (q^3 - q^{-3}) - \ldots + (-1)^n(q^{2n-1} - q^{-(2n-1)})$
$\quad = -q\,[1 - q^2 + \ldots + (-1)^{n-1}q^{2n-2}] + q^{-1}[1 - q^{-2} + \ldots + (-1)^{n-1}q^{-(2n-2)}]$
$\quad = -q\,\frac{1 - (-1)^n q^{2n}}{1 + q^2} + q^{-1}\,\frac{1 - (-1)^n q^{-2n}}{1 + q^{-2}}$
$\quad = (-1)^n\,\frac{q^{2n} - q^{-2n}}{q + q^{-1}} = (-1)^n\,2i\sin 2nx/2\cos x.$

Thus

$s_n'(x) = (-1)^n\,\frac{\sin 2nx}{2\cos x}.$

This calculation assumes $\cos x \ne 0$, which is the case in the interval of interest.
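The closed form for the derivative of the partial sum, $(-1)^n\sin 2nx/2\cos x$, is easy to test against the sum $-\sin x + \sin 3x - \ldots + (-1)^n\sin(2n-1)x$ itself. The following sketch (function names are ours) prints the difference at a few sample points where $\cos x \ne 0$.

```python
import math

def s_n_prime_sum(n, x):
    """-sin x + sin 3x - ... + (-1)^n sin(2n-1)x, the derivative of the partial sum s_n."""
    return sum((-1) ** k * math.sin((2 * k - 1) * x) for k in range(1, n + 1))

def s_n_prime_closed(n, x):
    """Closed form (-1)^n sin(2nx) / (2 cos x), valid where cos x != 0."""
    return (-1) ** n * math.sin(2 * n * x) / (2 * math.cos(x))

for n in (1, 2, 7):
    for x in (-1.2, 0.3, 1.0):
        print(n, x, s_n_prime_sum(n, x) - s_n_prime_closed(n, x))
```

Every printed difference should be at the level of rounding error.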
From this we deduce, by the FT, that 


(16.3) $s_n(y) - s_n(x) = (-1)^n\int_x^y \frac{\sin 2nt}{2\cos t}\,dt$
for $x$ and $y$ in the open interval $I = \,]-\pi/2, \pi/2[$. Since the sum of the series
is supposed to be a constant function we ought to show that the difference
(3), which tends to $s(y) - s(x)$, tends to 0 as $n$ increases indefinitely.
This is not obvious at first sight, but let us integrate by parts; we get

$\int_x^y \frac{\sin 2nt}{\cos t}\,dt = \left.\frac{-\cos 2nt}{2n\cos t}\right|_x^y + \frac{1}{2n}\int_x^y \frac{\cos 2nt\,\sin t}{\cos^2 t}\,dt.$
As n increases indefinitely the first term of the right hand side tends to 0
because of the factor n in the denominator. Likewise for the second term,
for, in the interval of integration, we have

$|\cos 2nt\cdot\sin t| \le 1, \qquad \cos^2 t \ge m$

where $m$ is the minimum of $\cos^2 t$ between $x$ and $y$; it is > 0 because a
continuous function attains its minimum on a compact interval, and the
interval $[x,y]$ contains no odd multiple of $\pi/2$. The factor 1/2n therefore
provides an upper bound O(1/n).

The sum s(x) of the series (1) is therefore constant on the interval
$|x| < \pi/2$. Its value must be

(16.4) $s(0) = 1 - 1/3 + 1/5 - \ldots = \pi/4$,


Leibniz’ series. 
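The slow, O(1/n) convergence just established can be observed numerically. The sketch below (the function name `square_wave_partial` is ours) evaluates the partial sum $s_n(x) = \cos x - \cos 3x/3 + \ldots$ at points inside $|x| < \pi/2$ and at one point between $\pi/2$ and $\pi$, where the expected value is $-\pi/4$.

```python
import math

def square_wave_partial(x, n):
    """s_n(x) = cos x - cos 3x/3 + ... + (-1)^(n-1) cos(2n-1)x/(2n-1)."""
    return sum((-1) ** (k - 1) * math.cos((2 * k - 1) * x) / (2 * k - 1)
               for k in range(1, n + 1))

for x in (0.0, 1.0, 2.5):
    for n in (10, 100, 5000):
        print(x, n, square_wave_partial(x, n))
print("pi/4 =", math.pi / 4)
```

The approach to $\pm\pi/4$ is visibly slower near $x = \pm\pi/2$, in line with the factor $1/\cos x$ in the estimate.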
We proved (4), more or less, in Chap. IV, n° 14 using the series expansion

(16.5) $\arctan y = y - y^3/3 + y^5/5 - \ldots$;
one obtains this by starting from the formula

$\arctan' y = 1/(1+y^2) = 1 - y^2 + y^4 - \ldots$

and applying the method valid for every power series. Nevertheless one must
be aware of the fact that this applies to the interior of the disc of convergence,
i.e. for |y| < 1; now Leibniz’ formula corresponds precisely to the value y = 1,
where $\arctan y = \pi/4$. To justify this one has to argue as we did à propos the
series log(1 + x), which, for x = 1, yields the series log 2 = 1 − 1/2 + 1/3 − ...
by passage to the limit: for $0 \le y \le 1$ the series (5) is alternating and has
decreasing terms, the difference between its total sum and its n-th partial
sum is thus majorised by its n-th term, so by 1/(2n + 1) for any y, whence
one concludes that the partial sums of (5) converge to the total sum
uniformly on the closed interval [0,1]. The total sum of the series is therefore
a continuous function there; likewise for the function arctan y; now, if two
functions are continuous for $y \le 1$, and coincide for y < 1, then they remain
equal at y = 1. Whence (4).


Starting from the series (1), one may obtain others by integration.
Calculating formally,

(16.6) $\int_0^x s(t)\,dt = \sin x - \sin 3x/3^2 + \sin 5x/5^2 - \ldots$

and since s(t) is a step function, it will not be difficult to calculate the integral
directly. But first one has to justify integrating the series (1) term-by-term.
Recall (Chap. III, n° 11) that the square wave series, though not uniformly
convergent on all R since its sum is discontinuous, converges uniformly on
every interval $[-\pi/2 + r, \pi/2 - r]$ with r > 0. The formula (6) is therefore
legitimate for $|x| \le \pi/2 - r$, so for $|x| < \pi/2$ since r is arbitrarily small. Since,
moreover, $s(x) = \pi/4$ on this interval, the integral is equal to $\pi x/4$. Whence

(16.7) $\sin x - \sin 3x/3^2 + \sin 5x/5^2 - \ldots = \pi x/4$ for $|x| \le \pi/2$.

The series is this time normally convergent on R because it is dominated by
the series $\sum 1/(2n+1)^2$; its sum is thus continuous everywhere, and this,
by passage to the limit, justifies the value $\pi^2/8$ which (7) attributes to it for
$x = \pi/2$.

For $\pi/2 \le x \le 3\pi/2$, one puts $x = y + \pi$, which brings one back to the
preceding case:

(16.8) $\sin x - \sin 3x/3^2 + \sin 5x/5^2 - \ldots = \pi(\pi - x)/4$
for $\pi/2 \le x \le 3\pi/2$.


The left hand side being periodic, it is unnecessary to continue the calcula- 
tion; it would be better to sketch the “curve”, displayed by Fourier himself: 


Fig. 10. 


Note that for $x = \pi/2$ we again find the relation
$1 + 1/3^2 + 1/5^2 + \ldots = \pi^2/8$

obtained in n° 5, eqn. (5.10) by applying the Parseval-Bessel formula to the
square wave series even though we had not yet justified formula (1).

Let us continue on the same track. Being normally convergent the series
(7) can be integrated term-by-term, for example on $[-\pi/2, x]$, which produces
the series $-\cos x + \cos 3x/3^3 - \ldots$. For $-\pi/2 \le x \le \pi/2$ the formula (7)
yields an integral equal to $\pi x^2/8 - \pi^3/32$, whence

(16.9) $\cos x - \cos 3x/3^3 + \cos 5x/5^3 - \ldots = \frac{\pi}{8}(\pi/2 + x)(\pi/2 - x)$ for $|x| \le \pi/2$.
For x = 0 one finds

(16.10) $1 - 1/3^3 + 1/5^3 - \ldots = \pi^3/32$.
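The integrated series converge much faster than the square wave series itself, so their sums are easy to check by machine. The sketch below (function names and the default of 5000 terms are our choices) compares partial sums of the sine series with $\pi x/4$ on $|x| \le \pi/2$ and with $\pi(\pi - x)/4$ for one $x$ between $\pi/2$ and $3\pi/2$, and partial sums of the cosine series with $\frac{\pi}{8}(\pi/2 + x)(\pi/2 - x)$.

```python
import math

def sine_series(x, terms=5000):
    """Partial sum of sin x - sin 3x/3^2 + sin 5x/5^2 - ..."""
    return sum((-1) ** k * math.sin((2 * k + 1) * x) / (2 * k + 1) ** 2
               for k in range(terms))

def cosine_series(x, terms=5000):
    """Partial sum of cos x - cos 3x/3^3 + cos 5x/5^3 - ..."""
    return sum((-1) ** k * math.cos((2 * k + 1) * x) / (2 * k + 1) ** 3
               for k in range(terms))

print(sine_series(0.8), math.pi * 0.8 / 4)
print(sine_series(math.pi / 2), math.pi ** 2 / 8)
print(sine_series(2.5), math.pi * (math.pi - 2.5) / 4)
print(cosine_series(1.0), math.pi / 8 * (math.pi / 2 + 1.0) * (math.pi / 2 - 1.0))
print(cosine_series(0.0), math.pi ** 3 / 32)
```

Each line prints a partial sum next to the closed form it should approach.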
On continuing to integrate the reader would find infinitely many other
formulae of the same kind, the first already in Euler, by other methods, of 
course. One may also examine what the Parseval-Bessel formula of (5.6) or 
(5.7) of this Chap. V may provide; there is no problem since the successive 
primitive series of the square wave series are (more and more) absolutely 
convergent. This kind of exercise is appreciably more instructive than calcu- 
lating a primitive of a rational fraction chosen only to make the candidate 
stumble over calculations of no other interest. 

17 — Wallis’ formula 
Let us put 
(17.1) $I_n = \int_0^{\pi/2}\sin^n x\,dx$
for $n \in \mathbb{N}$. First,
(17.2) $I_0 = \pi/2, \quad I_1 = 1.$

Next, integrate by parts, whence

$I_n = -\int_0^{\pi/2}\sin^{n-1}x\cdot\cos' x\,dx$
$\quad = -\sin^{n-1}x\cdot\cos x\,\Big|_0^{\pi/2} + \int_0^{\pi/2}(\sin^{n-1}x)'\cos x\,dx$
$\quad = (n-1)\int_0^{\pi/2}\sin^{n-2}x\cdot\cos^2 x\,dx = (n-1)(I_{n-2} - I_n).$

Finally we obtain $I_n = I_{n-2}(n-1)/n$, whence

$I_{2n} = \frac{(2n-1)(2n-3)\ldots 1}{2n(2n-2)\ldots 2}\,I_0$ with $I_0 = \pi/2$
and
$I_{2n+1} = \frac{2n(2n-2)\ldots 2}{(2n+1)(2n-1)\ldots 3}\,I_1$ with $I_1 = 1$.

Since $0 \le \sin x \le 1$ on the interval of integration it is clear that $I_{n+1} < I_n$,
thus that $I_{2n+1} < I_{2n} < I_{2n-1}$, whence

$1 < I_{2n}/I_{2n+1} < I_{2n-1}/I_{2n+1} = 1 + 1/2n.$
The ratio

$I_{2n}/I_{2n+1} = \frac{(2n+1)(2n-1)^2(2n-3)^2\ldots 3^2}{(2n)^2(2n-2)^2\ldots 2^2}\cdot\frac{\pi}{2}$

thus tends to 1, whence Wallis’ famous formula

(17.3) $\frac{2}{\pi} = \lim\frac{1.3.3.5.5.7\ldots(2n-1)(2n+1)}{2.2.4.4.6.6\ldots 2n.2n}.$


Fourier, who used it, wrote it in the form
$\frac{2}{\pi} = 3.3.5.5.7.7\ldots/2.2.4.4.6.6\ldots$;

though neither the numerator nor the denominator makes any sense, one
may try to interpret it as an infinite product (Chap. IV, n° 17). In the form
3/2.3/2.5/4.5/4... it is clearly divergent since the general term is of the form
(p + 1)/p = 1 + 1/p. One may then try the product of all the expressions
3.3/2.2, etc., i.e. the infinite product with general term $(2n+1)^2/4n^2 = (1 + 1/2n)^2$; but

$(1 + 1/2n)^2 = 1 + 1/n + 1/4n^2 = 1 + u_n$

where the series $\sum u_n$ is divergent since $u_n \sim 1/n$; new catastrophe. One
might prefer the groupings 3.3/4.4, etc., which leads to the product of the

$(2n+1)^2/(2n+2)^2 = [1 - 1/(2n+2)]^2 = 1 - 1/(n+1) + 1/4(n+1)^2 = 1 + u_n$,

which diverges as much as the preceding ones, and for the same reason,
namely that $u_n \sim -1/n$. The solution is to write (3) in the form

$\frac{2}{\pi} = \lim\frac{1.3}{2.2}\cdot\frac{3.5}{4.4}\cdot\frac{5.7}{6.6}\cdots\frac{(2n-1)(2n+1)}{2n.2n}$
or as
(17.4) $2/\pi = \prod (1 - 1/4n^2) = (1 - 1/4)(1 - 1/16)(1 - 1/36)\ldots$

The infinite product is this time absolutely convergent like the series $\sum 1/n^2$
and is greatly preferable to the formula (3), which would lead you to perdition
as we have just seen.
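Both the recurrence $I_n = I_{n-2}(n-1)/n$ and the convergent product $\prod(1 - 1/4n^2)$ lend themselves to a quick numerical experiment. The sketch below is illustrative only; the names `wallis_integral` and `wallis_product` are ours.

```python
import math

def wallis_integral(n):
    """I_n = integral of sin^n over [0, pi/2], built from I_n = I_{n-2} (n-1)/n."""
    value = math.pi / 2 if n % 2 == 0 else 1.0   # I_0 or I_1
    start = 2 if n % 2 == 0 else 3
    for m in range(start, n + 1, 2):
        value *= (m - 1) / m
    return value

def wallis_product(terms):
    """Partial product of (1 - 1/(4 n^2)) for n = 1..terms."""
    p = 1.0
    for n in range(1, terms + 1):
        p *= 1.0 - 1.0 / (4 * n * n)
    return p

for n in (1, 10, 100):
    ratio = wallis_integral(2 * n) / wallis_integral(2 * n + 1)
    print(n, ratio, 1 + 1 / (2 * n))          # ratio lies between 1 and 1 + 1/2n
for terms in (10, 1000, 10000):
    print(terms, wallis_product(terms), 2 / math.pi)
```

The partial products approach $2/\pi$ with an error of order 1/terms, which is slow but unmistakable.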

A very fast proof of Wallis’ formula comes from starting from the infinite
product

$\sin \pi x = \pi x\prod (1 - x^2/n^2)$;

one recovers Wallis when x = 1/2.
§ 5. Taylor’s Formula


18 — Taylor’s Formula


Chap. II, n° 19, has shown us that if a function f(z) is analytic on an open
set in C containing a point a, then the power series which represents it on a
neighbourhood of a has the form

(18.1) $f(z) = f(a) + f'(a)(z-a)/1! + f''(a)(z-a)^2/2! + \ldots$


This result applies in particular to functions which are defined and analytic 
on an interval in R. 

One obviously cannot hope for so much for every function defined on such 
an interval, even if it is indefinitely differentiable, since all the derivatives of 
such a function may well be zero at a without the function being zero on a 
neighbourhood of a, as Cauchy showed (Chap. IV, end of n° 5). The situation 
is still more desperate when we deal with functions which are not indefinitely 
differentiable. 

But let us start from the formula 


$f(x) - f(a) = \int_a^x f'(t)\,dt = \int_a^x f'(t)P_0(t)\,dt$

where $P_0(t) = 1$, and let $P_1(t)$ be a primitive of $P_0$. An integration by parts
shows that

$f(x) - f(a) = f'(t)P_1(t)\,\Big|_a^x - \int_a^x f''(t)P_1(t)\,dt.$

If $P_2(t)$ is a primitive of $P_1$ and if one integrates again by parts, one finds

$f(x) - f(a) = f'(t)P_1(t)\,\Big|_a^x - f''(t)P_2(t)\,\Big|_a^x + \int_a^x f'''(t)P_2(t)\,dt.$

If f is of class $C^{n+1}$, i.e. has continuous derivatives of order $\le n+1$ on
the interval I of R where it is defined, one may continue up to the integral
involving $f^{(n+1)}$. The result is clearly that, if one chooses polynomials $P_k(t)$
satisfying

(18.2) $P_k' = P_{k-1}, \quad P_0 = 1,$
then

(18.3) $f(x) - f(a) = \left[f'(t)P_1(t) - f''(t)P_2(t) + \ldots + (-1)^{n-1}f^{(n)}(t)P_n(t)\right]_a^x + (-1)^n\int_a^x f^{(n+1)}(t)P_n(t)\,dt.$

Since we would like to express f(x) in terms of its derivatives at a and of
the factors $(x-a)^k$, we ought to choose the $P_k$ so as to make the terms
$f'(x), \ldots, f^{(n)}(x)$ that appear in the integrated parts disappear from the
right hand side of (3); to do this we have to choose the $P_k$ to vanish for
t = x, in other words

(18.4) $P_k(t) = (t-x)^k/k! = (t-x)^{[k]}.$
After this calculation we obtain the following result:

Theorem 16. Let f be a function defined and of class $C^{n+1}$ on an interval
I of R. Then, for any $a, x \in I$,

(18.5) $f(x) = f(a) + f'(a)(x-a) + f''(a)(x-a)^{[2]} + \ldots + f^{(n)}(a)(x-a)^{[n]} + r_n(x)$

where

(18.6) $r_n(x) = \int_a^x f^{(n+1)}(t)(x-t)^{[n]}\,dt.$

This is Taylor’s formula with integral remainder for functions of class $C^{n+1}$,
and is due, with this very proof, to Cauchy.
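Taylor’s formula with integral remainder is an exact identity, and this can be seen numerically: the polynomial part plus a quadrature of the remainder integral $\int_a^x f^{(n+1)}(t)(x-t)^n/n!\,dt$ reproduces $f(x)$. The sketch below takes $f = \exp$ (so every derivative is again $\exp$); the helper names and the use of a composite Simpson rule are our choices, not part of the text.

```python
import math

def simpson(g, a, b, steps=2000):
    """Composite Simpson rule; also handles oriented integrals with b < a."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def taylor_polynomial(derivs_at_a, a, x):
    """Sum of f^(k)(a) (x-a)^k / k! for k = 0..n, given the derivatives at a."""
    return sum(d * (x - a) ** k / math.factorial(k)
               for k, d in enumerate(derivs_at_a))

def integral_remainder(f_n_plus_1, n, a, x):
    """r_n(x): integral from a to x of f^(n+1)(t) (x-t)^n / n! dt."""
    return simpson(lambda t: f_n_plus_1(t) * (x - t) ** n / math.factorial(n), a, x)

a, n = 0.5, 4
derivs = [math.exp(a)] * (n + 1)          # f = exp: all derivatives coincide
for x in (2.0, -1.0):                     # one point to the right of a, one to the left
    total = taylor_polynomial(derivs, a, x) + integral_remainder(math.exp, n, a, x)
    print(x, total, math.exp(x))
```

Note that the case $x < a$ uses the oriented integral from $a$ to $x$, which the quadrature handles through a negative step.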

An expression for the remainder that is sometimes more useful comes
from putting $x = a + h$ and replacing the function f(x) by the function
$g(u) = f(a + uh)$, where $u$ now varies in [0,1]. The limits a and x become 0
and 1 and we have $g^{(p)}(u) = h^p f^{(p)}(a + uh)$ by the most trivial form of the
chain rule. On applying the formula to g between u = 0 and u = 1 one finds
the following version of Taylor’s formula for f:

(18.5’) $f(a+h) = f(a) + f'(a)h + \ldots + f^{(n)}(a)h^n/n! + r_n(a+h)$,
where

(18.6’) $r_n(a+h) = \frac{h^{n+1}}{n!}\int_0^1 f^{(n+1)}(a+uh)(1-u)^n\,du.$

The simplest and most useful particular case is

(18.6”) $f(a+h) - f(a) - f'(a)h = h^2\int_0^1 f''(a+uh)(1-u)\,du.$
For a = 0, (5) becomes a formula of Maclaurin type:
(18.7) $f(x) = f(0) + f'(0)x + f''(0)x^2/2! + \ldots + f^{(n)}(0)x^n/n! + r_n(x)$

with

(18.8) $r_n(x) = \int_0^x f^{(n+1)}(t)(x-t)^{[n]}\,dt = \frac{x^{n+1}}{n!}\int_0^1 f^{(n+1)}(ux)(1-u)^n\,du.$
This expresses f as the sum of a polynomial
$p(x) = f(0) + f'(0)x + \ldots + f^{(n)}(0)x^{[n]}$

of degree n having at x = 0 (or, in the general case, at x = a) the same
derivatives of order $\le n$ as f, and a “remainder” given by (8). It is called the
“remainder” because, if f is analytic, $r_n(x)$ is actually the n-th remainder of
the Taylor series of f; but here there is no series. Moreover, when one speaks
of the partial sums and of the remainders of a series, one very much hopes
that these will be increasingly negligible as n increases.

The true situation is less impressive. If you apply (7) to an indefinitely 
differentiable function all of whose derivatives are zero at the origin, you will 
find $f(x) = r_n(x)$. Far from being negligible, the “remainder” is then the
predominant term. 

All the same, one can estimate it. Since $f^{(n+1)}$ is continuous it is bounded
on every compact set contained in the interval I where f is defined. If then
x remains in a compact interval K contained in I and containing a (for
example, an interval of centre a if a is interior to I), and if, as always, one
puts
(18.9) $\|f^{(n+1)}\|_K = \sup_{x\in K}|f^{(n+1)}(x)|$,

then $|f^{(n+1)}(t)(x-t)^{[n]}| \le \|f^{(n+1)}\|_K\cdot|(x-t)^{[n]}|$ on the interval of
integration; but on this interval we have $|(x-t)^{[n]}| = \pm(x-t)^{[n]}$ with the + sign if
x > a and the − sign if x < a, in which case the + sign is reestablished from
the fact that the oriented integral from a to x is accounted negatively. The
function $t \mapsto (x-t)^{[n]}$ having as a primitive the function

$t \mapsto -(x-t)^{[n+1]}$
which vanishes for t = x, one finds finally that
(18.10) $|r_n(x)| \le \|f^{(n+1)}\|_K\cdot|x-a|^{n+1}/(n+1)!$ for $x \in K$.

(6’) makes the result even more obvious.
Modifying the notation, in particular we have

(18.11) $f(a+h) = f(a) + f'(a)h + f''(a)h^{[2]} + \ldots + f^{(n)}(a)h^{[n]} + O(h^{n+1})$

as h tends to 0. If, for example, a function f of class $C^\infty$ vanishes together
with all its successive derivatives at x = a, then

$f(x) = O((x-a)^n)$ for every n

as x tends to a, which shows that, even in this case, Taylor’s formula does
provide information: f(a+h) tends to 0 more rapidly than every power of h.
We say then that the function f is flat at the point a. The aeroplane which
will never be built follows a trajectory of this kind on landing.
Exercise. Prove that

$p(h) = f(a) + f'(a)h + \ldots + f^{(n)}(a)h^{[n]}$
is the only polynomial satisfying

$f(a+h) = p(h) + o(h^n), \qquad d°(p) \le n.$

These results assume that $f^{(n+1)}$ exists and is continuous. One can in
fact prove the formula otherwise, without this last hypothesis, in the case
of a real-valued function; the practical usefulness of the result in relation to
the preceding one is weak, to speak moderately, but the proof is ingenious.
To simplify we restrict to the Maclaurin case, and, for x = b given, let us
consider the functions

$g(t) = f(b) - f(t) - f'(t)(b-t) - \ldots - f^{(n)}(t)(b-t)^{[n]},$
$h(t) = g(t) - g(0)(b-t)^{[n+1]}/b^{[n+1]}$
like Cauchy and his uniformed students, most of whom were probably
dozing. By hypothesis, g, and so h also, have first derivatives, not necessarily
continuous, between 0 and b. We have h(0) = 0 by a direct calculation, and
h(b) = g(b) = f(b) − f(b) = 0. Therefore there is a number $\xi$ between 0 and
b where $h'(\xi) = 0$ (mean value theorem, or even Rolle’s theorem, Chap. III,
n° 16). Now a calculation similar to the one that led us to (3) shows that

$g'(t) = -f^{(n+1)}(t)(b-t)^{[n]},$
whence
$h'(t) = -f^{(n+1)}(t)(b-t)^{[n]} + g(0)(b-t)^{[n]}/b^{[n+1]} = \left[g(0) - f^{(n+1)}(t)\,b^{[n+1]}\right](b-t)^{[n]}/b^{[n+1]}.$

Since $h'(\xi) = 0$ we therefore have $g(0) = f^{(n+1)}(\xi)\,b^{[n+1]}$ for a $\xi$ between 0
and b = x, and since

$g(0) = f(x) - \{f(0) + f'(0)x + \ldots + f^{(n)}(0)x^{[n]}\},$

finally we have

(18.12) $f(x) = f(0) + f'(0)x + \ldots + f^{(n)}(0)x^{[n]} + f^{(n+1)}(\xi)x^{[n+1]}$;

this is Maclaurin’s formula with Lagrange remainder. In the general case one
obtains
(18.13) $f(a+h) = f(a) + f'(a)h + f''(a)h^{[2]} + \ldots + f^{(n)}(a)h^{[n]} + f^{(n+1)}(a+\theta h)h^{[n+1]}$

where, yet again, f has real values, is (n+1) times differentiable, and where
the number traditionally denoted by $\theta$ is between 0 and 1 so that the point
$\xi = a + \theta h$ lies between a and x = a + h.

Applied to a function of class $C^\infty$ these results sometimes allow us to
pass to a Taylor series.


Example 1. Take a = 0 and f(x) = sin x. For n = 2p one finds $|r_n(x)| \le |x|^{2p}/(2p)!$
since the successive derivatives are everywhere at most 1 in
modulus. On passing to the limit one thus recovers the formula

$\sin x = \lim\,[x - x^3/3! + \ldots + (-1)^{p-1}x^{2p-1}/(2p-1)!]$.

This argument can be generalised. For a $C^\infty$ function the Taylor formula
with remainder applies for any n. To pass from this to a power series
expansion, it suffices (but this is the crucial point) to know that the remainder
tends to 0 as n increases indefinitely, in other words to have a suitable
estimate of the successive derivatives. If for example there exist constants M
and q such that $|f^{(n)}(x)| \le Mq^n$ for any n, then the formula (10), for a = 0,
shows that

$|r_n(x)| \le Mq^{n+1}|x|^{n+1}/(n+1)! = M(q|x|)^{n+1}/(n+1)!$

and the passage to the limit is justified. In other cases one has to argue in
some other way.


Example 2. Take $f(x) = e^x$. All the derivatives are equal to f, so that, for K
given, the factor $\|f^{(n+1)}\|_K$ in (10) does not depend on n. Thus $\lim r_n(x) = 0$
for every x and one recovers the power series for the exponential function.


Example 3. Take $f(x) = (1+x)^s$ with $s \in \mathbb{C}$ and −1 < x, whence
$f^{(n+1)}(x) = s(s-1)\ldots(s-n)(1+x)^{s-n-1}.$

Here, by (8),

$r_n(x) = s(s-1)\ldots(s-n)\,\frac{x^{n+1}}{n!}\int_0^1 (1+ux)^{s-n-1}(1-u)^n\,du,$

or

$r_n(x) = \frac{s(s-1)\ldots(s-n)\,x^{n+1}}{n!}\int_0^1 (1+xu)^{s-1}\left(\frac{1-u}{1+xu}\right)^n du.$

For x > −1 we have $1 + xu \ge 1 - u$ on the interval of integration, so that
$0 \le (1-u)^n/(1+xu)^n \le 1$ for every n. The modulus of the integral is
therefore less than a number independent of n. Since the series with general
term $s(s-1)\ldots(s-n)x^{n+1}/n!$ converges for |x| < 1 (use the $u_{n+1}/u_n$
criterion, this is not exactly the binomial series), its general term tends to 0.
Likewise therefore for $r_n(x)$. On passing to the limit in Taylor’s formula of
order n one thus recovers Newton’s series

$(1+x)^s = 1 + sx + s(s-1)x^2/2! + \ldots$
for |x| < 1, x real, s complex.
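Newton’s series can be summed directly for a complex exponent and compared with the principal value of $(1+x)^s$. The sketch below is a small experiment under our own naming (`newton_series`); it builds each term from the previous one through the factor $(s-n)x/(n+1)$.

```python
def newton_series(s, x, terms):
    """Partial sum 1 + s x + s(s-1) x^2/2! + ... using `terms` terms."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= (s - n) * x / (n + 1)   # ratio of consecutive terms
    return total

s = 0.5 + 2j
for x in (0.5, -0.5):
    print(x, newton_series(s, x, 200), (1 + x) ** s)
```

For real $x$ with $|x| < 1$ the base $1 + x$ is positive, so Python’s complex power agrees with $\exp(s\log(1+x))$, which is the function the series represents.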


We said above that if f is of class $C^{n+1}$ then the remainder in the Taylor
formula is $O((x-a)^{n+1})$ as x tends to a. One may improve this result for
f real, assuming $f^{(n+1)}(a) \ne 0$. The ratio $r_n(x)/(x-a)^{[n+1]} = f^{(n+1)}(\xi)$ then
tends to $f^{(n+1)}(a)$ since the derivative of order n + 1 is continuous and $\xi$ lies
between a and x, so tends to a. If one assumes $f^{(n+1)}(a) \ne 0$, then, in the
Maclaurin case, for simplicity,

(18.14) $f(x) - f(0) - f'(0)x - \ldots - f^{(n)}(0)x^{[n]} \sim f^{(n+1)}(0)x^{[n+1]}$

as x tends to 0, the ~ sign signifying, we recall, that the ratio between the
two sides of (14) tends to 1.

This formula sometimes allows one to calculate the limits of quotients
f(x)/g(x) as x tends to a point a where f(a) = g(a) = 0. For f and g of
class $C^n$, suppose that we have

$f(a) = f'(a) = \ldots = f^{(n-1)}(a) = 0,$
$g(a) = g'(a) = \ldots = g^{(n-1)}(a) = 0, \quad g^{(n)}(a) \ne 0.$

On a neighbourhood of a one has $g(x) \sim g^{(n)}(a)(x-a)^{[n]}$. If $f^{(n)}(a) \ne 0$,
then likewise $f(x) \sim f^{(n)}(a)(x-a)^{[n]}$. Consequently

(18.15) $\lim f(x)/g(x) = f^{(n)}(a)/g^{(n)}(a).$

If $f^{(n)}(a) = 0$ then $f(x) = o((x-a)^n)$ and the result remains valid. For
example, let us find the limit of the ratio

$\frac{\tan x - x}{x - \sin x}$

as x tends to 0. Here f(0) = f'(0) = f''(0) = 0, f'''(0) = 2 and
g(0) = g'(0) = g''(0) = 0, g'''(0) = 1. The limit is therefore 2.
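With f'''(0) = 2 and g'''(0) = 1, formula (18.15) gives the quotient 2/1 = 2 for this example, and a direct evaluation near 0 agrees. The sketch below (the name `ratio` is ours) simply tabulates the quotient for decreasing x; one should not take x too small, since both numerator and denominator suffer from cancellation in floating point.

```python
import math

def ratio(x):
    """(tan x - x) / (x - sin x), defined for small nonzero x."""
    return (math.tan(x) - x) / (x - math.sin(x))

for x in (0.5, 0.1, 0.01):
    print(x, ratio(x))
```

The printed values decrease towards 2 as x shrinks, consistent with the expansions $\tan x - x \sim x^3/3$ and $x - \sin x \sim x^3/6$.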

This is the famous l’Hôpital’s rule (he proved it for n = 1, the general
case being apparently due to Maclaurin), named for a Parisian marquis who,
in 1696, earned himself an enviable mathematical reputation by publishing
a book entitled Analyse des infiniment petits, pour l’intelligence des lignes
courbes, the first exposition to the public of what one then knew as the
differential calculus chez Leibniz and the Bernoullis. The author made no
mystery of the fact that he had learned everything from his correspondence
and conversations with them: “I have simply appropriated their discoveries
and those of Mr. Leibnis [sic]”, he wrote, skimping the name of his hero,
not even giving him a francophone t. This did not prevent Johann Bernoulli
from accusing him of having plagiarised one of his manuscripts and from 
attacking him in 1742, which was 38 years after the death of the presumed 
culprit. Since Johann and Jakob Bernoulli did not hesitate to accuse each 
other of plagiarism in the years from 1695, it is probably better to reserve 
one’s judgement, as does Moritz Cantor, pp. 222-228 of his Vol. III. 
“Science is a cruel game. One shoots one’s adversaries down in flames, 
one grills one’s competitors, one demolishes one’s rivals”, Loup Verlet tells us, 
p. 97 of La malle de Newton. One should not disdain disputes over priority; 
they reveal one of the most permanent and pervasive traits of the psychology 
of scientists: the defence of their intellectual property. They are an obligatory 
corollary of the following principle: the second person to prove a theorem or 
to discover an AIDS vaccine will not enhance his reputation; he has lost his 
time, except insofar as the necessary work may have taught him techniques 
of future application. One has to be the first³⁵. Moreover, since scientists,
and particularly “pure” mathematicians, derive no capital from their work
other than the recognition of their prowess by their peers³⁶, those who value
this have no choice. Those who patent their discoveries (again not the case
of the pure mathematicians, except recently in cryptology³⁷) or depend on


35 The subject is expounded with raw clarity in an interview with a Franco- 
American biologist, Portrait of a biologist as capitalist savage, to be found in 
Bruno Latour, Petites leçons de sociologie des sciences (Paris, La Découverte,
1993). The subject had inspired Robert K. Merton much earlier. In the real world 
it happens that a new result is obtained almost simultaneously by researchers 
working independently of one another; questions of priority are of more interest 
to them than to spectators. A particularly extreme case is described by Nicholas 
Wade, The Nobel Duel (Doubleday, 1981). In another domain, let us quote
the immortal declaration of the heroic Frenchman, who, on account of a young 
German, missed first place in the cyclists’ Tour de France of 1997: “What is 
missing, I think, is the opportunity to chance on a year without someone better 
than oneself.” 

36 This is the function of prizes, medals of gold or of chocolate, colloquia in honour
of, seats in academies, etc. which scientists dish out for mutual reward and
admiration. In mathematics, it has even happened, over the last fifteen years,
that the complete works of great men have been published before their deaths,
despite having to add supplementary volumes later. This facilitates colleagues’
work, but the psychological effect on the person so honoured is probably not
negligible. The Science Citation Index, moreover, allows every scientist to know
how many times his papers have been cited each year. When the total exceeds a
hundred (for Einstein, the record holder, the total even exceeded nine hundred
some years ago), one might think that the “reward” is sufficient.

37 Neal Koblitz, A Course in Number Theory and Cryptography (Springer, 2d. ed.,
1994), propounds the hypothesis that publications in certain parts of number
theory may some day have to be submitted to preliminary censoring by the
National Security Agency. This idea is not so wild, since, when the public key
systems invented by mathematicians (their own enterprise) appeared some
subventions from organisations advised by scientists have no more: if the
hormone you are trying to synthesise may bring you fifty thousand dollars a 
year for seventeen years it is urgent not to be second. 

With the passion for their profession, which, throughout history, has led 
scientists — not all, of course — to accept unworthy or mediocre conditions of 
life or work, this is the reason why you will see first class people spend sixty 
hours per week in the laboratory, for forty years, for a CNRS “director of 
research” salary, which any Polytechnic student or former pupil of an Haute 
Ecole des Etudes Commerciales Libérales et Avancées can obtain in industry 
or in a bank two years after leaving the institution. And so one lauds the 
disinterestedness of these “heroes of science” who have no choice. 

No choice? After all, as a garage owner who presented an unconscionably 
overinflated account explained to the present author (who, scion of the lower 
middle class, was not a customer of great standing and did not care to accept 
it), “with your brain, you should have chosen another profession”. Garage 
owner, for example. The garage owners could, in exchange, write the mathe- 
matics which sells itself cheap, teach the chemistry of polymers or the history 
of the Middle Ages to two hundred students every year, or try to understand 
Alzheimer’s disease. Then we might praise the disinterestedness of the garage 
owners. 

Others say that the ill-paid scientists are recompensed in that they enjoy 
themselves; this may often be true while they are working, but their families 
do not always appreciate this, and the argument does not have a universal 
validity. Marcel Dassault, the famous French aeroplane manufacturer, said, 
and repeated during his life, that what “amused him”, was to “make aero- 
planes” and one does not doubt this. This did not stop him from going to 
his office in a Rolls Royce, and this did not stop his son, who also amuses 
himself as he can and continues to make man-hunting fighter planes, from 


dozen years ago, the reactions of the NSA have been (i) to try to forbid them 
(in France, their use is subject to prior authorisation), (ii) to impose limits on 
the degree of security, (iii) to take charge of research contracts in this field. 
Remember that the NSA, created in 1952 and with an enormous budget, has 
the mission of ensuring the security of the telecommunications of the American 
government and the insecurity of those of others; see James Bamford, The Puzzle 
Palace (Houghton Mifflin, 1982) and Body of Secrets (Arrow Books, 2003). In 
“militarily sensitive” areas, access to an American thesis, or to certain courses or 
seminars, may be limited to people who have submitted to a “security clearance” 
guaranteeing their “loyalty”. In the USSR, all scientific or technical publications 
were, in principle, subject to prior censorship. Koblitz notes that, up to very 
recently, number theory had never lent itself to any application outside pure 
mathematics. The interest of some mathematicians in cryptology is however 
longstanding — Viete and Wallis for example — and one knows the part played 
during the War by the Turing team; as for those of our contemporaries who are 
involved, they do not have to proclaim it urbi et orbi. The novelty is the recourse 
to very advanced mathematics, with, necessarily, the cooperation of professionals 
in the theory, simply to understand it; the bridge players of the Enigma project 
are probably not enough. 
setting himself as the aim of his animal hunting for 1998 to kill, from a
turreted (armoured?) 4 x 4, one hundred and eighty-five head of large game in
his modest property of eight hundred hectares near Paris, which landed him
in trouble³⁸. Mr. Gates, despite his $N(t)\cdot 10^{10}$ dollars, also seems to amuse
himself.


Another long trivial theme (though not in the XVIIth century for
Fermat): the question of the maxima and minima of functions. If a differentiable
function f has a local maximum or minimum at a point a, i.e. if, on a
neighbourhood of a, one has $f(x) \le f(a)$ in the first case and $f(x) \ge f(a)$ in
the second, it is clear that f'(a) = 0. But this condition does not suffice,
as shown by the function $x^3$ at x = 0. To elucidate the question one has to
examine the higher derivatives. If

$f'(a) = f''(a) = \ldots = f^{(n)}(a) = 0, \quad f^{(n+1)}(a) \ne 0,$

then $f(x) - f(a) \sim f^{(n+1)}(a)(x-a)^{[n+1]}$ as we have seen; consequently, the
difference f(x) − f(a) has the sign of the right hand side for x sufficiently
near a. Since $(x-a)^{n+1}$ keeps a constant sign on both sides of a exactly when
n + 1 is even, the conclusion is: maximum if n is odd and $f^{(n+1)}(a) < 0$,
minimum if n is odd and $f^{(n+1)}(a) > 0$, while the graph of f crosses a
horizontal tangent (point of inflexion) if n is even.


38 Le Monde of 23 April 1997; the officials of the National Water and Forests Office 
apparently do not appreciate that the methods honoured in military aviation 
since the Great War are being applied to deer and boar. 
§ 6. The change of variable formula


19 — Change of variable in an integral


As we said above, the rules for calculating derivatives can be interpreted
in the language of integrals. We have seen the interpretation of the
relation (fg)' = f'g + fg' in terms of primitives. The other fundamental rule
of calculus, namely that the derivative of a composite function f[u(x)] is
f'[u(x)]·u'(x), likewise leads to an almost trivial integral formula, but it
facilitates many explicit evaluations. It also has very important applications,
mainly in the theory of line integrals (Vol. III).


Theorem 17. Let u be a real function defined and of class $C^1$ on a compact
interval I = [a,b], and f a function defined and continuous on the interval
J = u(I). Then

(19.1) $\int_{u(a)}^{u(b)} f(y)\,dy = \int_a^b f[u(x)]\,u'(x)\,dx.$

For, let F be a primitive of f on J, so that the left hand side is equal to
F[u(b)] − F[u(a)]. The function G(x) = F[u(x)] is differentiable on I and
G'(x) = F'[u(x)]u'(x) = f[u(x)]u'(x). The right hand side of (1) is thus
equal to G(b) − G(a) = F[u(b)] − F[u(a)], i.e. to the left hand side, qed.
Note that we are dealing with oriented integrals in this formula, since,
even if a < b, one may well have u(a) > u(b).
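The remark on oriented integrals can be illustrated with a decreasing change of variable. In the sketch below we take, purely as an example of our own choosing, u = cos on [0, 2] and f = exp, and compare both sides of (19.1) by a composite Simpson rule; the left hand side is then an integral from 1 down to cos 2.

```python
import math

def simpson(g, a, b, steps=2000):
    """Composite Simpson rule; a negative step handles the case b < a."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

u = math.cos                       # decreasing on [0, 2], so u(a) > u(b)
du = lambda x: -math.sin(x)        # u'
f = math.exp
a, b = 0.0, 2.0

left = simpson(f, u(a), u(b))                          # oriented integral of f from u(a) to u(b)
right = simpson(lambda x: f(u(x)) * du(x), a, b)       # integral of f[u(x)] u'(x) from a to b
exact = math.exp(u(b)) - math.exp(u(a))                # F[u(b)] - F[u(a)] with F = exp
print(left, right, exact)
```

All three numbers should coincide up to quadrature error, and they are negative, as the orientation requires.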
In the Leibniz notation for primitives one would write 


(19.2) $\int f(y)\,dy = \int f[u(x)]\,u'(x)\,dx.$


The formula is self-explanatory: one replaces y by its expression as a function
of x, simultaneously in f(y) and in dy (Chap. III, n° 14 and 15). This is
one of the great advantages of Leibniz’ system, with the analogous formula
dy/dx = dy/du · du/dx.

One may widen the hypotheses of Theorem 17 a little when applying it 
to regulated functions, but the reader new to the subject would be better to 
keep to the very simple Theorem 17 for the moment. 

The whole question is to reassure oneself that, F being a primitive of
the regulated function f and u a primitive of the regulated function u' in
the sense of n° 13, then G(x) = F[u(x)] is again a primitive of f[u(x)]u'(x).
Since G is continuous, because u and F are, it is enough to convince oneself
that (i) the function f[u(x)]u'(x) is regulated, (ii) G'(x) exists and is equal
to f[u(x)]u'(x) outside a countable set of values of x.

Point (i): assume that u'(x) and f[u(x)] are regulated. The first condition
is satisfied if u is a primitive of a regulated function. So is the second if f is
continuous since then $f\circ u$ is continuous; if f is only regulated one has to
check that $f\circ u$ has right and left limit values; now, as h tends to 0 through,
let us say, positive values, u(x + h) tends to u(x), though not necessarily in a
monotone way; to be certain that f[u(x)] is regulated it is therefore prudent
to assume u monotone, which is the most general case in practice.

Point (ii): the derivative G'(x) exists provided that u'(x) and F'[u(x)]
exist. The derivative u'(x) exists outside a countable set D and the derivative
F'(y) exists either everywhere if f is continuous, or outside a countable set
D' if f is only regulated. If f is continuous then G'(x) = f[u(x)]u'(x) outside
D, and since, in this case, the function f[u(x)]u'(x) is regulated as we have
seen à propos the point (i), everything works. If f is only regulated, in which
case we had better assume u monotone (as we have seen in point (i) for
f[u(x)]u'(x) to be regulated), the existence of G'(x) assumes $x \notin D$ and
$u(x) \notin D'$, i.e. $x \notin D \cup u^{-1}(D')$. If u is strictly monotone, so injective, the
inverse image $u^{-1}(D')$ is countable like D', so also is its union with D; then
G'(x) = f[u(x)]u'(x) outside a countable set and everything works again.
If u is not strictly monotone there are intervals on which u is constant, so
also f[u(x)] and the relation G'(x) = f[u(x)]u'(x) can be written on these
intervals as 0 = 0, which shows that it is not false ...

In conclusion, we see that the change of variable formula is valid in the 
two following cases: (a) f is continuous and u is a primitive of a regulated 
function; (b) f is regulated and u is a monotone primitive of a regulated 
function. 

In practice, one most often assumes the function wu to be strictly monotone, 
or, almost equivalently, that its derivative is always > 0 or always < 0. On 
writing a and b for what we wrote as u(a) and u(b) in (1), we then find the 
relation 


b u*(b) 
(19.3) [ fondy= fo Fluent eae 
where u~! : J —> I denotes the inverse map of u. 
Example 1. Calculate the indefinite integral $\int (x^2+1)^3 x\,dx$. Putting
$u(x) = x^2 + 1$ we have $u'(x)dx = 2x\,dx$, so we need to calculate
$\frac{1}{2}\int u(x)^3u'(x)\,dx$; this is situation (2) with $f(y) = y^3$. Thus

$\int (x^2+1)^3 x\,dx = \frac{1}{2}\int y^3\,dy = y^4/8$   with $y = x^2 + 1$.

To calculate the given integral between the limits $x = 2$ and $x = 3$, for
example, one notes that, in this case, $u(a) = 5$ and $u(b) = 10$, whence

$\int_2^3 (x^2+1)^3 x\,dx = y^4/8\,\Big|_5^{10} = (10^4 - 5^4)/8$.

In practice, one puts it as follows: make the change of variable $y = x^2 + 1$;
then $dy = 2x\,dx$ and $(x^2+1)^3 = y^3$ and consequently

$\int (x^2+1)^3 x\,dx = \frac{1}{2}\int y^3\,dy = y^4/8 = (x^2+1)^4/8$.
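
The change of variable of Example 1 can be checked numerically. The sketch below is only an illustration: the helper `simpson` (a composite Simpson rule) and the variable names are assumptions of this example, not part of the text. It integrates both sides of the formula and compares them with the closed form $(10^4 - 5^4)/8$.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

# Left side: integrate (x^2 + 1)^3 * x directly over [2, 3].
direct = simpson(lambda x: (x * x + 1) ** 3 * x, 2.0, 3.0)

# Right side: after y = x^2 + 1, integrate y^3 / 2 over [u(2), u(3)] = [5, 10].
substituted = simpson(lambda y: y ** 3 / 2, 5.0, 10.0)

# Closed form obtained from the primitive y^4 / 8.
closed_form = (10 ** 4 - 5 ** 4) / 8

print(direct, substituted, closed_form)
```

All three numbers should agree up to the discretisation error of the quadrature.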
Example 2. Let $f$ be a real function of class $C^1$ on an interval $I$, not
vanishing on $I$. To calculate $\int f'(x)/f(x)\,dx$ one performs the change of variable
$y = f(x)$, whence $dy = f'(x)dx$ and

$\int \frac{f'(x)}{f(x)}\,dx = \int \frac{dy}{y}$.

It remains to find a primitive of the function $1/y$ on the interval $J = f(I)$.
Since $f$ does not vanish it has constant sign on $I$. If it is positive, the function
$\log y$ will serve. If it is negative, it is the function $\log|y|$ which suits, since
for $y < 0$ the derivative of the latter, i.e. of $\log(-y)$, is $-\log'(-y) = 1/y$. In
conclusion, one finds

(19.4)   $\int \frac{f'(x)}{f(x)}\,dx = \log|f(x)|$.


For example,

$\int_2^4 \frac{dx}{x\log x} = \int_2^4 \frac{\log' x}{\log x}\,dx = \log(|\log x|)\Big|_2^4 = \log\frac{\log 4}{\log 2} = \log 2$

since $4 = 2^2$.
Likewise

$\int \tan x\,dx = -\int \cos' x/\cos x\,dx = -\log|\cos x|$,

so long as one works on an interval where the function $\cos x$ does not vanish,
say $]-\pi/2, \pi/2[$. Whence for example

$\int_0^{\pi/4} \tan x\,dx = -\log(\cos x)\Big|_0^{\pi/4} = -\log(1/\sqrt{2}) = \frac{1}{2}\log 2$.


Example 3. To calculate $\int dx/\sin x$ on an interval where the sine function
does not vanish, for example on $]0, \pi[$. If one is inspired, or if one has read
all the books, one observes that

$1/\sin x = 1/[2\sin(x/2)\cos(x/2)] = 1/[2\tan(x/2)\cos^2(x/2)] = f'(x)/f(x)$

where $f(x) = \tan(x/2)$ and $f'(x) = 1/[2\cos^2(x/2)]$, whence

$\int dx/\sin x = \log|\tan x/2|$.

This kind of recourse to Providence will not take one very far if one has no
general procedure for calculating integrals of the form

(19.5)   $\int \frac{\sum a_{pq}\cos^p x\,\sin^q x}{\sum b_{pq}\cos^p x\,\sin^q x}\,dx$
with a finite number of nonzero coefficients $a_{pq}$ and $b_{pq}$, in other words a
rational function of $\cos x$ and of $\sin x$. The method is to perform the change
of variable

(19.6)   $x = 2\arctan y$,   $y = \tan(x/2)$.

Trigonometry then shows that

(19.7)   $\sin x = 2y/(y^2+1)$,   $\cos x = (1-y^2)/(y^2+1)$

using the relations

$\sin 2t = 2\sin t\cos t = 2\tan t\cos^2 t = 2\tan t/(\tan^2 t + 1)$,
$\cos 2t = 2\cos^2 t - 1 = 2/(\tan^2 t + 1) - 1 = (1 - \tan^2 t)/(\tan^2 t + 1)$.

Further,

$dx = 2dy/(y^2+1)$.

On substituting in (5) one reduces to calculating an integral of a rational
function of $y$, which will be the aim of the following n°.
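
As a sanity check on (19.6) and (19.7), here is a small numerical sketch; the helper `simpson` and the sample values are assumptions of this illustration. It verifies the two trigonometric identities at one point, then checks that the substitution turns $\int dx/\sin x$ into $\int dy/y$, in agreement with Example 3.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

# Identities (19.7) at a sample point x, with y = tan(x/2).
x = 1.1
y = math.tan(x / 2)
sin_from_y = 2 * y / (y * y + 1)
cos_from_y = (1 - y * y) / (y * y + 1)

# On [a, b] inside ]0, pi[: dx / sin x becomes dy / y, since
# dx = 2 dy / (y^2 + 1) and sin x = 2y / (y^2 + 1).
a, b = 0.5, 2.5
lhs = simpson(lambda t: 1 / math.sin(t), a, b)
rhs = simpson(lambda v: 1 / v, math.tan(a / 2), math.tan(b / 2))
closed = math.log(math.tan(b / 2)) - math.log(math.tan(a / 2))

print(lhs, rhs, closed)
```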
Example 4. To calculate

$\int \frac{x^4+1}{\sqrt{(x-1)(x-5)}}\,dx$

we have to work in the interval $x < 1$, or in the interval $x > 5$, to obtain a
real result. We have $(x-1)(x-5) = (x-3)^2 - 4$, which suggests the change
of variable $x = 3 + 2y$, whence $dx = 2dy$ and reduction to

$\int \frac{(2y+3)^4+1}{\sqrt{y^2-1}}\,dy$.

A second change of variable $y = \cosh z$ reduces us to

$\int \frac{(2\cosh z+3)^4+1}{\sinh z}\,\sinh z\,dz = z + \int (2\cosh z+3)^4\,dz$

and to calculating the primitives of the functions $\cosh^n z$, which can be done
in several ways, the banal method (expanding as exponentials) most often
being the best.

If, in this example, the denominator had been the square root of a
trinomial without real roots we would have put it into the standard form
$(x-a)^2 + b^2$ and the change of variable $x - a = by$ would have led us
to $(y^2+1)^{1/2}$, in which case it is the change of variable $y = \sinh z$ which
leads us to the result.

There are also cases where, in the given trinomial, the coefficient of $x^2$ is
$< 0$. The same changes of variable lead this time to integrals in $(1-y^2)^{1/2}$
or in $(-1-y^2)^{1/2} = i(1+y^2)^{1/2}$. The second case is treated by putting
$y = \sinh z$ as above. In the first the change of variable $y = \sin z$ leads us to
the result.
Example 5 (Darboux, 1875). Consider the integral

$\int_0^1 2n^2x\,\exp(-n^2x^2)\,dx$;

the change of variable $y = n^2x^2$, for which $2n^2x\,dx = dy$, transforms it into
the integral of $e^{-y}$ taken from 0 to $n^2$. The result, $1 - \exp(-n^2)$, tends to 1
as $n$ increases indefinitely, although the function being integrated tends to 0.
How do you explain this strange phenomenon which Gaston Darboux was,
apparently, the first to discover?
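
One way to look at Darboux's example numerically is sketched below; the function names are assumptions of this illustration. At a fixed $x > 0$ the integrand tends to 0, while its maximum, attained at $x = 1/(n\sqrt{2})$ where it equals $n\sqrt{2/e}$, grows without bound, so the convergence to 0 is not uniform on $[0, 1]$.

```python
import math

def integrand(n, x):
    """The function 2 n^2 x exp(-n^2 x^2) of Darboux's example."""
    return 2 * n * n * x * math.exp(-((n * x) ** 2))

def exact_integral(n):
    """Integral over [0, 1], obtained from the change of variable y = n^2 x^2."""
    return 1 - math.exp(-n * n)

for n in (1, 10, 100):
    peak_location = 1 / (n * math.sqrt(2))
    print(
        n,
        integrand(n, 0.5),            # value at a fixed point: tends to 0
        integrand(n, peak_location),  # maximum value: grows like n
        exact_integral(n),            # integral: tends to 1
    )
```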


20 — Integration of rational fractions 


One does need these from time to time in real mathematics; but very rarely. 
In teaching they are useful only to (i) accustom students to algebraic calcu- 
lation, which will always be useful elsewhere, (ii) provide examiners with an 
inexhaustible reservoir of exercises built up over the generations, so enabling 
them, according to point (i), to test the candidate’s virtuosity. They may 
also be needed in certain electrotechnical calculations, for example, but this 
is surely not the principal motivation of the subject, inaugurated by Leibniz 
who was not thinking of the XIXth and XXth century students who were
obliged to suffer the fallout ... 

Let $f(x) = P(x)/Q(x)$ be a rational function of $x$, where $P$ and $Q$
are polynomials. By using the d'Alembert-Gauss theorem which we shall
prove later, and a few ideas from algebra, we may write $Q$ in the form
$Q(x) = Q_1(x) \ldots Q_r(x)$ where each of the $Q_k$ is, up to a constant factor,
of the form

$Q_k(x) = (x - a_k)^{n_k}$;

the $a_k$ are the various distinct roots of $Q$, perhaps complex, and the integers
$n_k$ are their orders of multiplicity, by definition. It is shown in all the algebra
textbooks that one may write $f$ in the form

(20.1)   $f(x) = p(x) + \sum_k \frac{p_k(x)}{(x - a_k)^{n_k}}$

with a polynomial $p$, the quotient of the Euclidean division of $P$ by $Q$, and
polynomials $p_k$ of degrees $< n_k$. On writing $p_k$ as a polynomial in $x - a_k$ one
finally finds a decomposition into simple elements of the form

(20.2)   $f(x) = p(x) + \sum_{k,n} A_{kn}/(x - a_k)^n$

with a finite number of constants $A_{kn}$ ³⁹. The search for a primitive of $f$ thus
reduces to that of a primitive of the polynomial $p$ (an immediate calculation)
and of functions of the form $(x - a)^{-n}$ where $n$ is an integer $\geq 1$. The result


39 Let $p$ and $q$ be two polynomials in one variable with coefficients in $K = \mathbb{Q}, \mathbb{R}, \mathbb{C}$
or any other field.
(20.3)   $\int \frac{dx}{(x-a)^n} = -\frac{1}{(n-1)(x-a)^{n-1}}$

is obvious if $n \neq 1$, but less so for $n = 1$.
It is first of all prudent (even if $n > 1$) to work in an interval $I$ of $\mathbb{R}$ not
containing $a$. If $a$ is real, then

(20.4)   $\int \frac{dx}{x-a} = \log|x-a|$   if $a \in \mathbb{R}$

since, for $y \neq 0$, the function $\log|y|$ has derivative $1/y$. In (4) we therefore
have to take $\log(a - x)$ for the primitive if $a$ is to the right of the interval $I$
and $\log(x - a)$ if it is to the left.

If $a$ is complex the business is more complicated.
Recall first (Chap. IV, n° 14, section (x) and § 4) that, for nonzero $z \in \mathbb{C}$,
one defines the expression $\mathrm{Log}\,z$ by

$\mathrm{Log}\,z = w \Longleftrightarrow z = e^w$.

There are infinitely many possible values, differing by a multiple of $2\pi i$.
Putting $w = u + iv$, we have $z = e^u e^{iv}$, whence $u = \log|z|$ and $v = \arg z$, i.e.

(i) Consider the set of polynomials of the form $up + vq$, where $u$ and $v$ are
arbitrary polynomials with coefficients in $K$. Among those which are not
identically zero let $d = u_0p + v_0q$ be a polynomial of minimal degree, not greater than
the degrees of $p$ and $q$ since $p = 1p + 0q$, $q = 0p + 1q$. Every polynomial which
divides $p$ and $q$ divides all the $up + vq$, so divides $d$. On the other hand, $d$ itself
divides $p$ and $q$, because the Euclidean division algorithm yields a relation of the
form $p = du + d'$, with $d'$ of degree strictly less than the degree of $d$, and the
relation $d' = (1 - uu_0)p - uv_0q$ shows then that $d' = 0$ since $d$ is of minimum
degree among the nonzero polynomials which can be written in the form $up + vq$.
In brief, $d$ is the gcd of $p$ and $q$.

(ii) If $d$ is constant, i.e. if $p$ and $q$ have no common nonconstant divisor, i.e. if
$p$ and $q$ are mutually prime, one may assume $d = 1$ whence

$r/pq = r(u_0p + v_0q)/pq = rv_0/p + ru_0/q$;

every rational fraction with denominator $pq$ is thus the sum of two rational
fractions whose denominators are respectively $p$ and $q$. More generally, every
rational fraction whose denominator is a product $p_1 \ldots p_k$ of pairwise mutually
prime polynomials is the sum of rational fractions having only one of the $p_i$ in its
denominator: $p_1$, for example, is prime to $p_2 \ldots p_k$, which allows us to simplify
the denominators step-by-step.

(iii) Suppose $q(X) = (X - a_1)^{n_1} \ldots (X - a_k)^{n_k}$ with pairwise distinct roots $a_i$
and integer exponents. The polynomials $(X - a_i)^{n_i}$ are pairwise mutually prime
since their divisors are obvious. Every rational fraction of the form $p/q$ thus
decomposes as a sum of fractions of the form $p_i(X)/(X - a_i)^{n_i}$. On writing $p_i(X)$
as a polynomial in $X - a_i$ one obtains the desired decomposition into “simple
elements”, qed.
(20.5)   $\mathrm{Log}\,z = \log|z| + i\arg z$.

If $z = x + iy$ we conclude that

(20.6)   $\mathrm{Log}\,z = \frac{1}{2}\log(x^2+y^2) + i\arg z$,

all this up to $2k\pi i$. For example, for $x$ real,


(20.7)   $\mathrm{Log}(x+i) = \frac{1}{2}\log(x^2+1) + i\arg(x+i)$.


Now we saw in Chap. IV, § 4 (v) that in the open set $G = \mathbb{C} - \mathbb{R}_-$ obtained
by removing the real half axis $x \leq 0$ from $\mathbb{C}$ there are uniform branches
of the pseudofunction Log; such a branch is a (true) continuous function
(and in fact analytic) $L(z)$ which, on $G$, satisfies the relation $z = e^{L(z)}$ for
every $z$; every other solution is obtained by adding a constant multiple of $2\pi i$
to $L(z)$, and the simplest solution, which one generally calls the principal
determination of the log in $G$, is that for which

(20.8)   $L(z) = \log|z| + i\arg z$   with $|\arg z| < \pi$;

this function is even analytic and satisfies

(20.9)   $L'(z) = 1/z$,

the derivative being taken, of course, in the complex sense (Chap. II, n° 19).

To extend the formula (4) to the case where $a$ is complex it is enough to
consider the function $x \mapsto L(x - a)$ on $\mathbb{R}$. Since $a$ is not real, the points $x - a$,
situated on the horizontal through $-a$, all lie in the open set $G = \mathbb{C} - \mathbb{R}_-$; the
function

(20.10)   $L(x-a) = \log|x-a| + i\arg(x-a)$   with $|\arg(x-a)| < \pi$,

obtained by composing $x \mapsto x - a$ and the analytic function $L$, is ipso facto
of class $C^1$ in $\mathbb{R}$ and its derivative is the function $L'(x-a) = 1/(x-a)$
by Chap. III, n° 21, Example 1, where we showed generally that if $g(z)$ is
holomorphic and $f(t)$ is differentiable then

$\frac{d}{dt}\,g[f(t)] = g'[f(t)]f'(t)$.

From this we deduce that up to an additive constant

(20.11)   $\int \frac{dx}{x-a} = L(x-a) = \log|x-a| + i\arg(x-a)$,   $a \notin \mathbb{R}$,

where the argument must be chosen between $-\pi$ and $+\pi$ so that the right
hand side will, at least, be a continuous function of $x$, and so in fact $C^\infty$.
Suppose for example that we are to integrate $1/(x-i)$ from $x = 1$ to $x = 2$.
We have to calculate the variation of the function $\log|x-i| + i\arg(x-i)$
between these values. That of the log is

$\log|(2-i)/(1-i)| = \frac{1}{2}\log(5/2)$.

The points $x - i$ lie below the real axis, so that one has to choose their
arguments between $-\pi$ and 0; then $\arg(2-i) = -\arctan(1/2)$ and
$\arg(1-i) = -\pi/4$. Finally the desired integral equals

$\frac{1}{2}(\log 5 - \log 2) + i[\pi/4 - \arctan(1/2)]$.
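
The integral of $1/(x-i)$ from 1 to 2 can be verified numerically. The sketch below assumes that Python's `cmath.log` plays the role of the principal determination $L$ of (20.8), with its branch cut along the negative real axis; the helper `simpson` is an assumption of this illustration. The real part of the result should be $\frac{1}{2}\log(5/2)$ and the imaginary part $\pi/4 - \arctan(1/2) = \arctan(1/3)$.

```python
import cmath
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; works for complex-valued f."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

# Direct numerical integration of 1 / (x - i) over [1, 2].
numeric = simpson(lambda x: 1 / (x - 1j), 1.0, 2.0)

# Variation of the principal logarithm L(x - i) between x = 1 and x = 2.
via_log = cmath.log(2 - 1j) - cmath.log(1 - 1j)

print(numeric, via_log)
print(0.5 * math.log(5 / 2), math.atan(1 / 3))
```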


Let us now give two examples of application to primitives of rational
functions.

Example 1. Calculate

$I = \int \frac{dx}{(x^2+1)^2}$.

We write

(20.12)   $1/(x^2+1)^2 = A/(x-i)^2 + B/(x-i) + B'/(x+i) + A'/(x+i)^2$


with coefficients to be determined. Multiplying through by $(x-i)^2$ one finds
$1/(x+i)^2$ on the left hand side and, on the right hand side, $A$ plus terms
containing factors $x - i$, so zero for $x = i$. Consequently $A = 1/(2i)^2 = -1/4$.
Similarly, $A' = -1/4$. The terms in $A$ and $A'$ have sum

$-\frac{1}{4}[(x-i)^2 + (x+i)^2]/(x^2+1)^2 = -\frac{1}{2}(x^2-1)/(x^2+1)^2$;

on substituting this in the left hand side of (12) one obtains the relation

$\frac{1}{2}\cdot\frac{1}{x^2+1} = B/(x-i) + B'/(x+i)$;

here again, one multiplies through by $x - i$ and puts $x = i$ in the result; this
gives $B = 1/4i = -i/4$ and likewise $B' = i/4$. Whence finally

$1/(x^2+1)^2 = -1/4(x-i)^2 - i/4(x-i) + i/4(x+i) - 1/4(x+i)^2$.

Consequently,

$I = \int \frac{dx}{(x^2+1)^2} = \frac{1}{4}\left(\frac{1}{x-i} + \frac{1}{x+i}\right) + \frac{i}{4}[L(x+i) - L(x-i)]$.


The rest of the problem is now to express this result in real terms. Now
$L(x+i) = \frac{1}{2}\log(x^2+1) + i\arg(x+i)$ with a similar formula for $L(x-i)$;
it is clear on the other hand (sketch!) that $\arg(x-i) = -\arg(x+i)$ if one
chooses the arguments between $-\pi$ and $+\pi$. The expression between [ ] is
therefore equal to $2i\arg(x+i)$, the argument being chosen between 0 and $\pi$
since $x + i$ is above the real axis. Thus, up to an additive constant (we are
calculating a primitive),

$I = x/2(x^2+1) - \frac{1}{2}\arg(x+i)$   with $0 < \arg(x+i) < \pi$.

To reduce to a more familiar expression we note that the argument $t$ of $x + i$
satisfies $\tan t = 1/x$, whence

$\arg(x+i) = \arctan(1/x) = \pi/2 - \arctan x + 2k\pi$.

The left hand side having to be a continuous function of $x \in \mathbb{R}$ and the
function arctan also being so, if one insists on values between $-\pi/2$ and
$+\pi/2$, the integer $k$ must be independent of $x$; for $k = 0$, one actually finds
for the right hand side a value between 0 and $\pi$, as it must be. The constant
$\pi/2$ being unimportant in calculating the desired primitive, we conclude that

$I = x/2(x^2+1) + \frac{1}{2}\arctan x + \text{const}$.


The reader may, as an elementary prudence, check the result by calculating 
its derivative; the fact that it is real is already a good sign ... 
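
The check by differentiation suggested here can also be done numerically. In the sketch below the function names and the step `h` are assumptions of this illustration; the derivative of $x/2(x^2+1) + \frac{1}{2}\arctan x$ is approximated by a central difference and compared with $1/(x^2+1)^2$.

```python
import math

def primitive(x):
    """Candidate primitive x / (2 (x^2 + 1)) + (1/2) arctan x."""
    return x / (2 * (x * x + 1)) + 0.5 * math.atan(x)

def integrand(x):
    """The function 1 / (x^2 + 1)^2 being integrated."""
    return 1 / (x * x + 1) ** 2

def central_difference(F, x, h=1e-5):
    """Symmetric finite-difference approximation of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

for x in (-3.0, -0.5, 0.0, 1.0, 4.0):
    print(x, central_difference(primitive, x), integrand(x))
```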

The reader will find very many examples of this technique in all the 
textbooks, although the great majority of authors recoil from the complex 
log. Moreover one is not always forced to use these when the denominator Q of 
the given real rational function has complex roots. For then, taking together 
the conjugate imaginary terms of the decomposition (2) in the case of a real 
function, one is led to sums of expressions of the form $(Ax+B)/(x^2+px+q)^n$
where the trinomial $x^2 + px + q$, with $p$ and $q$ real, has no real roots, i.e. can
be written $(x-a)^2 + b^2$ with $a$, $b$ real and $b \neq 0$. The change of variable
$x = a + by$ then reduces it to calculating integrals of the form

$I_n = \int dx/(x^2+1)^n$,   $J_n = \int x\,dx/(x^2+1)^n$.

Since 1 is the derivative of the function $x$, an integration by parts gives

$I_n = x/(x^2+1)^n + 2n\int x^2dx/(x^2+1)^{n+1} = x/(x^2+1)^n + 2nI_n - 2nI_{n+1}$

since $x^2 = (x^2+1) - 1$; on replacing $n$ by $n - 1$, this relation can again be
written as

$(2n-2)I_n = (2n-3)I_{n-1} + x/(x^2+1)^{n-1}$,

which allows us to calculate step-by-step, starting from $I_1 = \arctan x$; one
can even calculate the general formula for $I_n$ directly, but it is clearly not
worth doing. A similar method applies to the $J_n$.
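
The step-by-step calculation of the $I_n$ can be sketched in a few lines. The code below assumes the recurrence in the form $(2n-2)I_n = (2n-3)I_{n-1} + x/(x^2+1)^{n-1}$ with $I_1 = \arctan x$, which follows from the integration by parts above; the names `primitive_I` and `simpson` are assumptions of this illustration. Each $I_n$ produced this way vanishes at $x = 0$, so it can be compared with a numerical integral over $[0, x]$.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def primitive_I(n, x):
    """Primitive of 1 / (x^2 + 1)^n vanishing at 0, built by the recurrence."""
    value = math.atan(x)  # I_1
    for k in range(2, n + 1):
        # (2k - 2) I_k = (2k - 3) I_{k-1} + x / (x^2 + 1)^(k - 1)
        value = ((2 * k - 3) * value + x / (x * x + 1) ** (k - 1)) / (2 * k - 2)
    return value

x = 1.5
for n in (1, 2, 3, 4):
    numeric = simpson(lambda t: 1 / (t * t + 1) ** n, 0.0, x)
    print(n, primitive_I(n, x), numeric)
```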
Example 2. Leibniz having believed he would flabbergast Newton with his
series for $\pi/4$, the latter retorted that he knew others, and better ones,
especially the formula

(20.13)   $\pi/2\sqrt{2} = 1 + 2\left(1/(3\cdot 5) - 1/(7\cdot 9) + 1/(11\cdot 13) - \ldots\right)$,

but of course without presenting the proof.
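But we know that he derived it from integrating the function

$\frac{1+x^2}{1+x^4} = \frac{1}{2}\cdot\frac{1}{x^2 - x\sqrt{2} + 1} + \frac{1}{2}\cdot\frac{1}{x^2 + x\sqrt{2} + 1}$.

Putting $\varepsilon = \pm 1$, one has $x^2 - \varepsilon x\sqrt{2} + 1 = (x - \varepsilon/\sqrt{2})^2 + 1/2$, which suggests
the change of variable $x - \varepsilon/\sqrt{2} = y/\sqrt{2}$; then $dx = dy/\sqrt{2}$ and
$x^2 - \varepsilon x\sqrt{2} + 1 = \frac{1}{2}(y^2+1)$; consequently,

$\frac{1}{\sqrt{2}}\int \frac{dx}{x^2 - \varepsilon x\sqrt{2} + 1} = \int \frac{dy}{y^2+1} = \arctan y = \arctan(x\sqrt{2} - \varepsilon)$.

One deduces

$\sqrt{2}\int \frac{1+x^2}{1+x^4}\,dx = \arctan(x\sqrt{2}+1) + \arctan(x\sqrt{2}-1)$.

But the addition formula

$\tan(u+v) = \frac{\tan u + \tan v}{1 - \tan u\,\tan v}$

shows that

$\arctan x + \arctan y = \arctan\frac{x+y}{1-xy} + k\pi$.

An easy calculation then shows that

(20.14)   $\sqrt{2}\int \frac{1+x^2}{1+x^4}\,dx = \arctan\left[x\sqrt{2}/(1-x^2)\right]$

whence

(20.15)   $\sqrt{2}\int_0^t \frac{1+x^2}{1+x^4}\,dx = \arctan\left[t\sqrt{2}/(1-t^2)\right]$

for $0 \leq t < 1$. As $t$ tends to 1, $t\sqrt{2}/(1-t^2)$ tends to $+\infty$, its arctan tends to
$\pi/2$ and finally one finds

(20.16)   $\int_0^1 \frac{1+x^2}{1+x^4}\,dx = \pi/2\sqrt{2}$.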


On the other hand, the integrand is represented by the power series

$\frac{1+x^2}{1+x^4} = (1 - x^4 + x^8 - \ldots) + (x^2 - x^6 + x^{10} - \ldots)$.

Can one integrate this term-by-term between 0 and 1?
There is no problem in integrating over $[0,t]$ with $t < 1$. One finds

$(t - t^5/5 + t^9/9 - \ldots) + (t^3/3 - t^7/7 + t^{11}/11 - \ldots)$.


It remains to pass to the limit in each series as $t$ tends to 1. Once again
we have alternating series with decreasing terms: by Leibniz' estimate of the
remainder, namely $t^n/n \leq 1/n$, the partial sums $s_n(t)$, clearly continuous
for $|t| \leq 1$, converge to the total sum $s(t)$ uniformly on the closed interval
$|t| \leq 1$, so that it is a continuous function of $t$ for $t \leq 1$. One may then write
that

$\lim_{t \to 1-0} s(t) = \lim_{t \to 1-0}\,\lim_n s_n(t)$   [by definition of $s(t)$]
$= \lim_n\,\lim_{t \to 1-0} s_n(t)$   [Chap. III, n° 12, Theorem 16]
$= \lim_n s_n(1)$   [since $s_n$ is continuous]   $= s(1)$.

By (16), one finally finds

$\pi/2\sqrt{2} = (1 - 1/5 + 1/9 - \ldots) + (1/3 - 1/7 + 1/11 - \ldots)$

and has only to rearrange the terms to obtain Newton's series.
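
Both forms of the series can be summed numerically and compared with $\pi/2\sqrt{2}$. In the sketch below the function names are assumptions of this illustration; `newton_series` sums $1 + 2(1/(3\cdot 5) - 1/(7\cdot 9) + \ldots)$ and `two_series` sums the two alternating series term by term. The first converges noticeably faster, its terms decreasing like $1/k^2$ instead of $1/k$.

```python
import math

def newton_series(terms):
    """Partial sum of 1 + 2 (1/(3*5) - 1/(7*9) + 1/(11*13) - ...)."""
    total = 1.0
    for k in range(terms):
        a = 4 * k + 3
        total += (-1) ** k * 2 / (a * (a + 2))
    return total

def two_series(terms):
    """Partial sum of (1 - 1/5 + 1/9 - ...) + (1/3 - 1/7 + 1/11 - ...)."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * (1 / (4 * k + 1) + 1 / (4 * k + 3))
    return total

target = math.pi / (2 * math.sqrt(2))
for terms in (10, 100, 1000):
    print(terms, newton_series(terms) - target, two_series(terms) - target)
```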
§ 7. Generalised Riemann integrals


Up to now we have attempted to integrate only bounded functions on a 
bounded and, most often, compact interval. To get further we have to free 
ourselves from these restrictions. The method is quite similar to that for pass- 
ing from the partial sums to the total sum of a series. To avoid complications 
which would not be helpful at this level, we restrict ourselves to regulated 
functions, i.e. those having right and left limits at each point, and so are 
integrable on every compact interval (or even bounded interval, if they are 
themselves bounded: n° 7, corollary to Theorem 6); if we really wanted to 
generalise, we would have to go to the grand integration theory (Appendix). 
This would moreover allow us to simplify many proofs very appreciably, also 
the somewhat artificial statements needed in order to remain at the “elemen- 
tary” level. 


21 — Convergent integrals: examples and definitions 


Suppose for example that we wish to assign a meaning to the integral

$\int_0^b dx/x$.

It is natural, if only for a geometric reason, to consider it as the limit of the
integral over $(u,b)$ as $u > 0$ tends to 0. This is equal to $\log b - \log u$ and, hard
luck, therefore tends to $+\infty$, which is not the result hoped for, even if, after
all, we have attributed the sum $+\infty$ to divergent series with positive terms.
One could also attribute to $1/x$ the value $+\infty$ at $x = 0$, so obtaining a lower
semicontinuous function on $[0,b]$ to which one applies the definition of the
integral given in n° 11; the result is the same, as one sees on calculating the
integrals over $[0,b]$ of the functions $\inf(n, 1/x)$ and then passing to the limit
as $n \to +\infty$. Replacing $1/x$ by $x^s$ with $s$ real, a function of which a primitive
is $x^{s+1}/(s+1)$, one obtains the same result if $s + 1 < 0$. For $s + 1 > 0$ the
integral over the interval $[u,b]$, equal to $b^{s+1}/(s+1) - u^{s+1}/(s+1)$, tends to
$b^{s+1}/(s+1)$; whence “clearly”, i.e. by definition,

(21.1)   $\int_0^b x^s\,dx = b^{s+1}/(s+1)$   if $s > -1$ and $b > 0$.
0 


Consider next the integral

$\int_a^{+\infty} x^s\,dx$

with $a > 0$ to eliminate a possible difficulty at $x = 0$, and where $s$ is real.
It is natural to consider this as the limit of the integral over $(a,v)$ as $v$
increases indefinitely. If $s = -1$, one finds $\log v - \log a$, which tends to $+\infty$.
If $s \neq -1$, one finds $v^{s+1}/(s+1) - a^{s+1}/(s+1)$. If $s > -1$, the result increases
indefinitely. If $s < -1$, it tends to $-a^{s+1}/(s+1)$, whence the formula

(21.2)   $\int_a^{+\infty} x^s\,dx = -a^{s+1}/(s+1)$   for $s < -1$, $a > 0$.


We remark in passing that the hypotheses on $s$ that give a meaning to the
integrals (1) and (2) are mutually exclusive; in other words, the integral

$\int_0^{+\infty} x^s\,dx$   IS NEVER FINITE.
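
The behaviour of the partial integrals of $x^s$ can be tabulated directly from the primitive. In the sketch below the function `partial_integral` and the sample exponents are assumptions of this illustration; for $s = -1/2$ the integral over $[u, 1]$ approaches $b^{s+1}/(s+1) = 2$ as $u$ tends to 0, for $s = -2$ the integral over $[1, v]$ approaches $-a^{s+1}/(s+1) = 1$ as $v$ grows, and each exponent fails at the other end.

```python
import math

def partial_integral(s, u, v):
    """Integral of x**s over [u, v], 0 < u < v, computed from the primitive."""
    if s == -1:
        return math.log(v) - math.log(u)
    return (v ** (s + 1) - u ** (s + 1)) / (s + 1)

# s > -1: convergence at 0, divergence at +infinity.
print(partial_integral(-0.5, 1e-12, 1.0), partial_integral(-0.5, 1.0, 1e12))

# s < -1: convergence at +infinity, divergence at 0.
print(partial_integral(-2.0, 1.0, 1e12), partial_integral(-2.0, 1e-12, 1.0))

# s = -1: divergence at both ends.
print(partial_integral(-1.0, 1e-12, 1.0), partial_integral(-1.0, 1.0, 1e12))
```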


Let us generalise. Let $f$ be a regulated function on a noncompact interval
$X = (a,b)$, either not bounded, or bounded but not closed. If one argues as
we did in defining convergence of a series, one is led to associate a “partial
integral”

(21.3)   $s(K) = \int_K f(x)dx = \int_u^v f(x)dx = s(u,v) = F(v) - F(u)$,

to every compact interval $K = [u,v]$ contained in $X$, where $F$ is a primitive
of $f$ in $X$. We then say that the integral

(21.4)   $s(X) = \int_X f(x)dx = \int_a^b f(x)dx$

converges if $s(K)$ tends to a limit (which, by definition, will be the
integral (4)) as $K$ “tends to” $X$, i.e. when $u$ and $v$ tend respectively to⁴⁰ $a$ and
$b$. As in the case of a series [Chap. II, eqn. (15.4)], this means⁴¹ that for every
$r > 0$ there exists a compact interval $K \subset X$ such that, for every compact
interval $K' \subset X$,

(21.5)   $K \subset K' \Longrightarrow |s(K') - s(X)| < r$.


One could also then say that f is “integrable” on X, but it is better to 
abstain carefully from this when the integral of |f| does not converge, this 
term having been reserved, in the only theory which has counted for a long 
time, that of Lebesgue, for absolutely integrable functions. It would be better 
to speak of semiconvergent integrals when $\int |f(x)|dx$ does not converge.
(5), again, means that, for every $r > 0$, one has $|s(X) - s(u,v)| < r$ once
u is close enough to a and v close enough to b. If, for example, a is finite 


40 If X and f are bounded, in which case f is integrable on X in the sense of n° 2 
(n° 7, corollary to Theorem 6), this definition is compatible with that of n° 2. 

41 One can define this type of limit precisely. Let $\varphi(K)$ be a function of a variable
compact set $K \subset X$. One says that it tends to a limit $u$ as $K$ tends to $X$ if
for every $r > 0$ there exists a compact $K \subset X$ such that $K \subset K' \subset X \Longrightarrow
|u - \varphi(K')| < r$. This is definition (21.5).
and $b = +\infty$, this means that, for every $r > 0$, there exists an $r' > 0$ and an
$N > 0$ such that

$\{(u - a < r')\ \&\ (v > N)\} \Longrightarrow |s(X) - s(u,v)| < r$.

In terms of primitives,

(21.6)   $\int_a^b f(x)dx = \lim_{v \to b} F(v) - \lim_{u \to a} F(u)$,

so that the integral converges if and only if $F$ has finite limit values at $a$ and
$b$. These limits actually exist if the integral converges, for, in this case, one
has $|s(u',v') - s(u'',v'')| < r$ if $u'$ and $u''$ are close enough to $a$ and $v'$ and
$v''$ close enough to $b$; taking $v' = v''$, one sees then that $|F(u') - F(u'')| < r$,
so that Cauchy's criterion is satisfied by $F(u)$ as $u$ tends to $a$. It is clear
conversely that the integral converges if $F$ has limits at the end-points of $X$.
This is exactly what we have verified in the preceding examples. The
method also works for integrating an exponential function $e^{cx}$ with $c$ real
and nonzero, since the behaviour of the primitive $e^{cx}/c$ as $x$ tends to $+\infty$
or $-\infty$ has been elucidated in Chap. IV. In particular, one may integrate $e^x$
from $-\infty$ to any finite limit, but one cannot integrate from $-\infty$, nor from a
finite limit, to $+\infty$.
But in general one does not know $F$, so the usefulness of (6) is heavily
constrained; it is better, in most cases, to examine the order of magnitude of
$f(x)$ as $x$ tends to $a$ or to $b$, just as one does for series.


22 — Absolutely convergent integrals 


The theory of series is particularly simple when they are absolutely con- 
vergent. Likewise here. We shall say that the integral (21.4) is absolutely 
convergent if the integral of $|f(x)|$ is convergent, in which case one may say
that f is integrable over X (or, to reassure the reader who is starting these 
topics, absolutely integrable) without risking a collision with the Lebesgue 
theory. 


Theorem 18. (i) Let $f$ be a positive regulated function defined on an
interval $X = (a,b)$. Then $f$ is integrable on $X$ if and only if the integrals over the
compact subsets $K \subset X$ are bounded above; and then

(22.1)   $\int_X f(x)dx = \sup_{K \subset X} \int_K f(x)dx$.

(ii) Let $f$ be a regulated function defined on an interval $X$; if the integral
$\int f(x)dx$ extended over $X$ is absolutely convergent (i.e. if $f$ is absolutely
integrable on $X$) then it is convergent and

(22.2)   $\left|\int_X f(x)dx\right| \leq \int_X |f(x)|dx$.
To prove point (i) we observe that, $f$ being positive, $s(K)$ is an increasing
function of $K$:

$K \subset K' \Longrightarrow s(K) \leq s(K')$.

The arguments of Chap. II, n° 9 on increasing sequences transpose
immediately to here without our needing to expound them all again. One might also
observe that, if $f$ is positive, its primitives are increasing functions; these
tend to finite limits at the end-points of $X$ if and only if they are bounded
on $X$.

To establish the assertion (ii) one can reduce to the case of a real function,
then to that of a positive function by writing $f = f^+ - f^-$. Since $f^+$ and
$f^-$ are majorised by $|f|$, point (i) shows that the integrals of these functions
converge, so that of $f$ too. One could also use directly one of the numerous
variants of Cauchy's criterion adapted to the situation, namely that the
partial integrals $s(K)$ tend to a limit if and only if for every $r > 0$ there exists
a compact subset $K \subset X$ such that

$|s(K') - s(K'')| < r$   for any $K' \supset K$ and $K'' \supset K$;

but, since $K'$ and $K''$ contain $K$, we have⁴²

$|s(K') - s(K'')| = \left|\int_{K'-K} f(x)dx - \int_{K''-K} f(x)dx\right| \leq \int_{(K'-K)\cup(K''-K)} |f(x)|dx$,

the integral of $|f|$ being extended over the set $(K' \cup K'') - K$; if we write
$S(K)$ for partial integrals relative to $|f|$ we have

$|s(K') - s(K'')| \leq S(K' \cup K'') - S(K)$,

an arbitrarily small quantity for any $K', K'' \supset K$ for $K$ “large enough” if
the $S(K)$ are bounded above.

The inequality (2) is obvious when integrating over a compact interval
$K \subset X$, hence propagates to the limit, qed.


Theorem 18 has some trivial consequences, which we use constantly. 


Corollary 1. Let $f$ be a bounded regulated function on an interval $X$, and
$u$ an absolutely integrable regulated function on $X$. Then the function $f(x)u(x)$
is absolutely integrable on $X$ and

$\left|\int f(x)u(x)dx\right| \leq \|f\|_X \int |u(x)|dx$.


42 In what follows, we will have occasion to integrate over a set which is a finite
union of intervals having, pairwise, at most one point in common; the integral will 
clearly, by definition, be the sum of the integrals extended over these intervals. 
This, furthermore, is equivalent to multiplying the integrand by the characteristic 
function, a step function, of this union. 
Obvious. As we shall do on various occasions in the rest of this §, we have
used the $\int$ sign to denote integrals extended over $X$.
Example: the Fourier transform 


(22.3)   $\hat{u}(y) = \int e^{-2\pi ixy}u(x)dx$


of an absolutely integrable function on X = R is defined for every y € R. 

In the following statement one says that a function $f$ is square
(understood: absolutely) integrable on an interval $X$ if the function $|f|^2$ is integrable
on $X$. One may then generalise Cauchy-Schwarz:

Corollary 2. Let $f$ and $g$ be two regulated square integrable functions on an
interval $X$; then the function $f(x)g(x)$ is absolutely integrable on $X$ and

$\left|\int f(x)g(x)dx\right|^2 \leq \int |f(x)|^2dx \cdot \int |g(x)|^2dx$.

One replaces $f$ and $g$ by $|f|$ and $|g|$, writes the Cauchy-Schwarz inequality
for every compact interval $K \subset X$ and notes that the left hand side is, for
any $K$, majorised by the right hand side of the inequality to be established,
whence the result on passage to the limit. Or else, see the end of n° 14, which
will prove more generally that if the functions $|f|^p$ and $|g|^q$ are integrable for
$1/p + 1/q = 1$, then $fg$ is also integrable, and a Hölder inequality is valid.


The convergence conditions for the integrals involving $x^s$ established
above for $s$ real can be transformed immediately into conditions for absolute
convergence in the case of a complex exponent, since

$|x^s| = x^{\mathrm{Re}(s)}$   for $x > 0$.

On the other hand, point (ii) of Theorem 18 shows that if, on a neighbourhood
of one of the limits of integration, one has a relation of the form

$f(x) = O(g(x))$,

then absolute convergence (on a neighbourhood of this end-point) of the
integral of $g(x)$ implies that of the integral of $f(x)$; if one has the more
precise relation $f(x) \asymp g(x)$ then the integrals are of the same nature as
concerns absolute convergence. (It might, on the other hand, happen that
one of them converges non-absolutely and that the other diverges, as can
occur with series.)

In elementary practice the absolute convergence of an integral is almost 
always shown by comparing the behaviour of the integrand with that of a 
combination of classical functions: exponentials, powers, logarithms, etc. It 
is worth having these results permanently available, and to have understood 
their reasons. 
First of all, putting $X = (a,b)$, one may always choose a $c$ such that
$a < c < b$ and decompose the integral into integrals extended over $(a,c)$ and over
$(c,b)$. If the integrand is regulated there will be no problem of convergence
on a neighbourhood of $c$, and this allows us to isolate the difficulties. One
may always, in the case where $c$ is the right endpoint, reduce to the case
where $c = +\infty$ by the change of variable $c - x = 1/y$. If $c$ is the left endpoint,
one may reduce to the case where $c = 0$ by $x - c = y$, or to $c = -\infty$ by
$x - c = -1/y$.

Now consider the prototype integral

(22.4)   $\int_a^{+\infty} \log^m x \cdot x^n e^{-sx}\,dx$   $(a > 0)$

where $m$, $n$, $s$ are a priori complex but can in fact be assumed real, since
the modulus of the integrand is obtained by replacing the exponents by their
real parts. In view of the orders of increase of the three functions involved,
it is pretty clear that the convergence of the integral is, for $s \neq 0$, governed
by the exponential factor. It follows immediately from Chap. IV, n° 5, that

$\log^m x \cdot x^n = o(e^{rx})$   as $x \to +\infty$ for every $r > 0$;

the integral will thus converge if there exists an $r > 0$ making the integral
of $e^{(r-s)x}$ converge, i.e. if $r - s < 0$, whence convergence for $s > 0$, strict
inequality. For $s < 0$, the integrand grows indefinitely, whence divergence.

In the case where $s = 0$, the change of variable $x = e^y$ leads us to integrate
the function $y^m e^{(n+1)y}$ on a neighbourhood of infinity; the integral is thus
convergent for $n < -1$ and divergent for $n > -1$.

If, finally, we have $n = -1$, so that we are dealing with the integral of
$x^{-1}\log^m x$, the same change of variable reduces to the function $y^m$, whence
convergence for $m < -1$ and divergence for $m \geq -1$.

In conclusion, convergence is governed by the exponential function if this 
is actually present or, if it is absent (s = 0), by the power function if that is 
actually present; if these two functions are absent, the integral converges if 
and only if m < —1. 
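
The discussion of (22.4) amounts to a three-level decision rule, which can be written out as follows. The function name `converges_at_infinity` is an assumption of this sketch, and the parameters are taken real, as the text allows.

```python
def converges_at_infinity(m: float, n: float, s: float) -> bool:
    """Convergence at +infinity of the integral of log^m(x) * x^n * exp(-s x).

    The exponential decides when present (s != 0), then the power
    (n != -1), and finally the logarithm.
    """
    if s != 0:
        return s > 0
    if n != -1:
        return n < -1
    return m < -1

# A decaying exponential wins against any power and any logarithm.
print(converges_at_infinity(m=5, n=3, s=0.1))    # exponential present, s > 0
print(converges_at_infinity(m=100, n=-1.5, s=0)) # no exponential, n < -1
print(converges_at_infinity(m=-1, n=-1, s=0))    # only the log, m = -1
```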

One might study integrals similar to (4), but containing more simple 
factors, by the same method; you could, for example, introduce the factors 
log log x, or log log log x, etc. ..., which grow more and more slowly and do 
not affect the result so long as there are factors present which decrease much 
more rapidly than them. You might also insert a factor $e^{-x^2}$, which tends
to 0 fast enough to annihilate even the most vertical exponential functions
... Etc.

The study of an integral such as

(22.5)   $\int_0^b |\log x|^m x^n\,dx$   $(0 < b < +\infty)$

reduces to the preceding case; the change of variable $x = 1/y$ transforms it
into the integral of the function $\log^m y \cdot y^{-n-2}$ on a neighbourhood of $+\infty$.
The integral (5) is thus convergent if $-n - 2 < -1$, i.e. if $n > -1$; it diverges
if $n < -1$. If $n = -1$, the integral converges if $m < -1$ and diverges in
the contrary case. Note in passing that (5) always converges for $n = 0$, i.e.
when the term $x^n$ is absent, a result due to the fact that on a neighbourhood
of 0 a power of the log grows less quickly than any negative power of $x$, for
example than the function $x^{-1/2}$ whose integral converges (primitive: $2x^{1/2}$).
For $n = 0$, $m = 1$, note that $\log x$ has primitive $x\log x - x$, a function which
tends to a limit, namely 0, when $x \to 0$.

Exercise: extend this calculation to the case of an arbitrary integer m > 0 
by integrating by parts. 

The case of rational functions is particularly simple: if $p$ and $q$ are
polynomials and $q$ has no real roots then the absolute convergence of the integral

$\int_{-\infty}^{+\infty} \frac{p(x)}{q(x)}\,dx$

depends only on the integer $n = d°(q) - d°(p)$ since the function is equivalent
to $1/x^n$ to within a constant factor; so absolute convergence is equivalent to
the condition $n \geq 2$.


Example 1. Consider Euler’s ubiquitous Gamma function

(22.6)    $\Gamma(s) = \int_0^{+\infty} e^{-x} x^{s-1} \, dx$.

Absolute convergence at infinity is automatic, but, at 0, requires Re(s) > 0.
An integration by parts⁴³ then shows that

    $\Gamma(s+1) = \int_0^{+\infty} e^{-x} x^s \, dx = -e^{-x} x^s \Big|_0^{+\infty} + s \int_0^{+\infty} e^{-x} x^{s-1} \, dx$

and since the integrated-out part is clearly zero, we have

(22.7)    $\Gamma(s+1) = s\Gamma(s)$.

Since it is clear that $\Gamma(1) = 1$ we deduce that

(22.8)    $\Gamma(n) = (n-1)!$

for every integer $n \ge 1$. This is Euler’s method for defining the “factorial” of
an arbitrary complex number. We shall see in n° 25 that the $\Gamma$ function is
holomorphic in the half plane Re(s) > 0 and that it is even the restriction of
a function holomorphic on $\mathbb{C} - \{0, -1, -2, \ldots\}$.
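
As a numerical sanity check on (22.7) and (22.8), the following minimal sketch compares both sides of each relation using the standard library's `math.gamma`. The helper names `recurrence_gap` and `factorial_gap` are illustrative only and do not come from the text; the check covers real arguments only.

```python
import math


def recurrence_gap(s: float) -> float:
    """Relative gap between Gamma(s + 1) and s * Gamma(s), cf. (22.7); real s > 0."""
    lhs = math.gamma(s + 1.0)
    rhs = s * math.gamma(s)
    return abs(lhs - rhs) / abs(lhs)


def factorial_gap(n: int) -> float:
    """Relative gap between Gamma(n) and (n - 1)!, cf. (22.8); integer n >= 1."""
    exact = math.factorial(n - 1)
    return abs(math.gamma(n) - exact) / exact


if __name__ == "__main__":
    for s in (0.5, 1.0, 2.75, 7.3):
        print(f"s = {s}: gap in (22.7) = {recurrence_gap(s):.2e}")
    for n in range(1, 8):
        print(f"n = {n}: gap in (22.8) = {factorial_gap(n):.2e}")
```

Both gaps should be of the order of floating-point rounding; this illustrates the relations but of course proves nothing.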


43 The formula $\int f'g = fg - \int fg'$ for integration by parts applies to the integrals
considered here, on condition that we check that the function $fg$ has limit values
at the end-points of the interval of integration $X$: integrate over a compact
$K \subset X$ and pass to the limit.
Example 2. Euler also studied the integral

(22.9)    $B(x, y) = \int_0^1 t^{x-1} (1-t)^{y-1} \, dt$

where $x$ and $y$ are a priori complex (and rational for him). Absolute
convergence on a neighbourhood of 0 requires Re(x) > 0 and, on a neighbourhood
of 1, Re(y) > 0. Clearly

(22.10)    $B(x, y) = B(y, x)$

(change of variable $t \mapsto 1 - t$). The change of variable $t \mapsto \sin^2 t$ shows that

(22.11)    $B(x, y) = 2 \int_0^{\pi/2} \sin^{2x-1} t \cdot \cos^{2y-1} t \, dt$.

We shall see later (n° 26, Example 1) that

(22.12)    $B(x, y) = \Gamma(x)\Gamma(y)/\Gamma(x+y)$,

a famous formula due to Euler with, as always, a proof which posterity,
principally Jacobi, has rectified. It immediately provides the explicit value of
(11) for $x, y \in \mathbb{N}$.
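
The relations (22.10) to (22.12) can be spot-checked numerically. The sketch below approximates the trigonometric form (22.11) by a midpoint rule and compares it with the Gamma quotient of (22.12). The function names, the step count and the restriction to real $x, y \ge 1$ (so that the integrand stays bounded) are assumptions made for this illustration, not part of the text.

```python
import math


def beta_trig(x: float, y: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of (22.11) for real x, y >= 1."""
    h = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoint of the k-th subinterval of [0, pi/2]
        total += math.sin(t) ** (2 * x - 1) * math.cos(t) ** (2 * y - 1)
    return 2.0 * h * total


def beta_gamma(x: float, y: float) -> float:
    """Right hand side of (22.12)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)


if __name__ == "__main__":
    for x, y in ((2.0, 3.0), (1.5, 2.5), (4.0, 1.0)):
        print(x, y, beta_trig(x, y), beta_gamma(x, y))
```

For integers the quotient is elementary: with $x = 2$, $y = 3$ it equals $1!\,2!/4! = 1/12$, which the quadrature should reproduce to many digits.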


23 — Passage to the limit under the $\int$ sign


For generalised or “improper” Riemann integrals there are theorems on pas- 
sage to the limit which the Lebesgue theory has rendered obsolete, but remain 
usable at a more elementary level. For example: 


Theorem 19 (Poor man’s dominated convergence). Let $(f_n)$ be a sequence
of regulated functions, absolutely integrable on an interval $X \subset \mathbb{R}$.
Assume that

(i) the $f_n$ converge to a limit $f$ uniformly on every compact $K \subset X$,

(ii) there exists a positive function $p$, integrable on $X$, and such that
$|f_n(x)| \le p(x)$ for any $x$ and $n$.

Then the function $f$ is absolutely integrable on $X$ and

(23.1)    $\int_X f(x) \, dx = \lim \int_X f_n(x) \, dx$.

First of all, it is clear that $f$, being regulated like the $f_n$, is absolutely
integrable on $X$ since $|f(x)| \le p(x)$ for every $x$. Since $p$ is positive and
integrable, for every $r > 0$ there exists a compact interval $K \subset X$ such that

(23.2)    $\int_{X-K} p(x) \, dx < r$

(and even for every $K' \supset K$), whence the same relation for each $|f_n(x)|$ and
for $|f(x)|$. On the other hand

(23.3)    $\int_K |f(x) - f_n(x)| \, dx < r$ for $n$ large

since $f_n$ converges uniformly to $f$ on $K$. Now the right hand side of

    $\left| \int_X f(x) \, dx - \int_X f_n(x) \, dx \right| \le \int_X |f(x) - f_n(x)| \, dx$

is the sum of the analogous integrals extended over $K$ and $X - K$; by (2),
the second is $< 2r$ since $|f(x) - f_n(x)| \le 2p(x)$ for any $x$ and $n$; the first is
$< r$ for $n$ large by (3). The left hand side of the preceding relation (and even
the right hand side) thus tends to 0, qed.


Example 1. Consider the function

    $\Gamma(s) = \int_0^{+\infty} e^{-x} x^{s-1} \, dx, \qquad \mathrm{Re}(s) > 0$,

again, and observe that

    $e^{-x} x^{s-1} = \lim (1 - x/n)^n x^{s-1}$.

We cannot just bluntly apply Theorem 19 since the functions on the right
hand side are not integrable between 0 and $+\infty$: convergence at 0 presupposes
Re(s) > 0 and convergence at infinity Re(s) < $-n$. For $x < n$ we always have
[Chap. III, n° 16]

    $\log[(1 - x/n)^n] = n \log(1 - x/n) = -x - x^2/2n - \ldots \le -x$

and so $(1 - x/n)^n \le e^{-x}$. Now consider the functions

(23.4)    $f_n(x) = \begin{cases} (1 - x/n)^n x^{s-1} & \text{for } 0 < x \le n, \\ 0 & \text{for } x > n; \end{cases}$

they converge to the absolutely integrable function $e^{-x} x^{s-1}$ and satisfy
$|f_n(x)| \le |e^{-x} x^{s-1}|$. To be able to apply Theorem 19 it therefore suffices
to show that convergence is uniform on every compact subset of $]0, +\infty[$. If
we accept this point provisionally we then find

    $\Gamma(s) = \lim \int_0^n (1 - x/n)^n x^{s-1} \, dx = \lim n^s \int_0^1 (1 - u)^n u^{s-1} \, du$;

on integrating by parts à la Leibniz, i.e. without limits of integration, we find
that

    $\int (1 - u)^n u^{s-1} \, du = (1 - u)^n u^s / s + \frac{n}{s} \int (1 - u)^{n-1} u^s \, du$

and since the integrated-out part is zero for $u = 0$ [because Re(s) > 0] and
$u = 1$, we find

    $\int_0^1 (1 - u)^n u^{s-1} \, du = \frac{n}{s} \int_0^1 (1 - u)^{n-1} u^s \, du$;

whence, iterating,

(23.5)    $\int_0^1 (1 - u)^n u^{s-1} \, du = \frac{n!}{s(s+1) \cdots (s+n)}$.


A little less than two centuries after 1812 and Gauss, who did not know 
that Euler, ever present, had preceded him along this path about 1776, as 
Remmert, Funktionentheorie 2, pp. 34-36, tells us, we find that 


(23.6)    $\Gamma(s) = \lim n! \, n^s / s(s+1) \cdots (s+n)$

for Re(s) > 0. We can derive an expansion of $\Gamma$ as an infinite product, but
we still lack the necessary “Euler’s constant” which will appear in Chap. VI,
n° 18.
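
To get a feeling for how slowly the limit (23.6) is approached, here is a minimal sketch for real $s > 0$. The function name `gauss_gamma` is hypothetical; the product is accumulated factor by factor, as $n^s/s$ times the product of $k/(s+k)$ for $k = 1, \ldots, n$, which is an implementation choice made here to avoid forming $n!$ explicitly.

```python
import math


def gauss_gamma(s: float, n: int) -> float:
    """n-th term of (23.6): n! n^s / (s (s+1) ... (s+n)), for real s > 0."""
    value = n ** s / s
    for k in range(1, n + 1):
        value *= k / (s + k)  # contributes k to n! and (s + k) to the denominator
    return value


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        approx = gauss_gamma(0.5, n)
        print(n, approx, abs(approx - math.gamma(0.5)))
```

The printed errors shrink roughly like $1/n$, so (23.6) is a poor way to compute $\Gamma$ in practice even though it is a convenient theoretical formula.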

We still have to show that the functions (4) converge uniformly on every
compact interval $K = [a, b]$ with $0 < a < b < +\infty$. The factor $x^{s-1}$ being
bounded on $K$ since $a > 0$, it is enough to examine the factor $(1 - x/n)^n$.
For $n > b$ we have $|x/n| < 1$ in $K$ and thus

    $\log[(1 - x/n)^n] = n \log(1 - x/n) = -(x + x^2/2n + x^3/3n^2 + \ldots)$;

the sequence of functions $\log[(1 - x/n)^n]$, thus also that of the functions
$(1 - x/n)^n$, is therefore increasing on $K$, and even on $[0, b]$, for $n > b$. Since
it converges to the continuous function $e^{-x}$ uniform convergence on $K$ follows
from Dini’s Theorem of n° 10.
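
The monotone convergence used in the Dini argument can be observed numerically. The sketch below samples the gap $e^{-x} - (1 - x/n)^n$ on a grid of $[0, b]$ and returns its largest sampled value; the name `sup_gap`, the grid size and the choice $b = 3$ are assumptions made for the illustration, and a grid maximum only approximates the true supremum.

```python
import math


def sup_gap(n: int, b: float, samples: int = 2000) -> float:
    """Largest sampled value of e^{-x} - (1 - x/n)^n on [0, b]; assumes n > b."""
    worst = 0.0
    for k in range(samples + 1):
        x = b * k / samples
        worst = max(worst, math.exp(-x) - (1.0 - x / n) ** n)
    return worst


if __name__ == "__main__":
    for n in (4, 8, 16, 64, 1024):
        print(n, sup_gap(n, 3.0))
```

The printed values decrease towards 0 as $n$ grows, in line with the increasing convergence of $(1 - x/n)^n$ to $e^{-x}$ on $[0, b]$.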

More elementarily, so more complicated: first remark that $\log[(1 - x/n)^n] =
-x - x^2/2n - \ldots$ converges uniformly to $-x$ on $[0, b]$, since, for $n > b$,

    $|x^2/2n + x^3/3n^2 + \ldots| \le \frac{b^2}{n} (1 + b/n + b^2/n^2 + \ldots) = \frac{b^2}{n - b}$

for every $x \in [0, b]$. Since $(1 - x/n)^n = \exp[n \log(1 - x/n)]$, it remains either
to “dirty one’s hands” by calculating (exercise!), or to establish a general
lemma to bypass the calculations:


Lemma. Let $K$ be a compact subset of $\mathbb{C}$, let $(f_n)$ be a sequence of functions
which converges uniformly on $K$ to a bounded limit function $f$ on $K$, and
let $g$ be a function defined and continuous on an open set $U$ containing the
closure of $f(K)$. Then the composite function $g_n = g \circ f_n$ is defined on $K$ for
$n$ large and converges uniformly to $g \circ f$.

The limit $f$ being bounded, the closure $H$ of $f(K)$ is compact so that,
for every integer $p$, the set $H_p$ of the $z \in \mathbb{C}$ such that $d(z, H) \le 1/p$ is
also compact. Since $U$ is open and contains $H$, it contains⁴⁴ an $H_p$. Since
$\|f - f_n\|_K \le 1/p$ for $n$ large, we therefore have $f_n(K) \subset H_p \subset U$, which allows
us to define $g_n(x) = g[f_n(x)]$. Now the function $g$ is uniformly continuous
on the compact $H_p$; for every $r > 0$ there is therefore an $r' > 0$ such that
$|g(z') - g(z'')| < r$ if $z', z'' \in H_p$ satisfy $|z' - z''| < r'$; now this, for $n$ large,
is the case for any $x \in K$ if one takes $z' = f(x)$ and $z'' = f_n(x)$. Hence
$\|g \circ f - g \circ f_n\|_K \le r$ for $n$ large, qed.

Note in passing that the lemma makes no hypothesis as to the nature of
the $f_n$; in particular, they are not assumed continuous; it is $g$ which must be.
But if the $f_n$ are continuous, the function $f$ is so too, and the closure $H$ of
$f(K)$ is in fact the compact set $f(K)$ itself.


If, instead of integrating a sequence of functions one integrates a series,
one has to consider the partial sums $s_n(x)$ of the series and apply the
preceding theorem to them. The simplest result is the following:


Theorem 20. Let $\sum u_n(x)$ be a series of absolutely integrable regulated
functions on an interval $X$. Assume that (i) the series converges uniformly
on every compact $K \subset X$; (ii) there exists a positive function $p(x)$,
integrable on $X$, such that $\sum |u_n(x)| \le p(x)$ for every $x \in X$. Then the function
$s(x) = \sum u_n(x)$ is absolutely integrable on $X$ and

(23.7)    $\int_X s(x) \, dx = \sum \int_X u_n(x) \, dx$.

The hypothesis (i) shows that the partial sums $s_n(x)$ converge uniformly
on every compact $K \subset X$; since (ii) shows that $|s_n(x)| \le p(x)$, one need only
apply the preceding theorem to the $s_n$.

Condition (ii) is analogous to normal convergence on all of $X$, but more
restrictive. In fact, and in contrast to the case of integrals extended over a
compact interval, normal convergence in $X$ is not enough to assure (7) if
$X$ is not bounded. We do know then, of course, that for $n$ large the
difference between the total sum $s(x)$ and the partial sum $s_n(x)$ is $< r$ for any
$x \in X$ since this is majorised by the $n$-th remainder of the series $\sum v_p$, which
dominates the series $\sum u_n(x)$. But we cannot extract any estimate for the
difference between their integrals over $X$ if $X$ is not bounded.

One may however establish a useful result whose formulation is very close 
to that of one of the fundamental results of the Lebesgue theory: 


44 The $H_p \cap (\mathbb{C} - U)$ are closed and bounded and form a decreasing sequence of
compacta; their intersection, contained simultaneously in $H$ (because $H = \bigcap H_p$
for every compact $H$) and in $\mathbb{C} - U$, is empty; thus $H_p \cap (\mathbb{C} - U) = \emptyset$, i.e. $H_p \subset U$,
for $p$ large: Chap. III, n° 9 or corollary 1 of BL.
Theorem 21. Let $X$ be an interval and $\sum u_n(x)$ a series of regulated functions
which converges normally on every compact $K \subset X$. Assume that

(23.8)    $\sum \int_X |u_n(x)| \, dx < +\infty$.

Then the function $s(x) = \sum u_n(x)$ is absolutely integrable on $X$ and

    $\int_X s(x) \, dx = \sum \int_X u_n(x) \, dx$.

Let us put, generally,

    $m_I(f) = \int_I f(x) \, dx$

and consider a compact interval $K \subset X$. Since the given series converges
normally on $K$ (“uniformly” would suffice) and one may integrate term-by-term
on a compact set (n° 4), the relation $|s(x)| \le \sum |u_n(x)|$ shows that

    $m_K(|s|) \le m_K\left(\sum |u_n|\right) = \sum m_K(|u_n|) \le \sum m_X(|u_n|) = M < +\infty$

by (8). The regulated function $s$ is therefore absolutely integrable on $X$
[Theorem 18, (i)], with $m_X(|s|) \le \sum m_X(|u_p|)$. Omitting the first $N$ terms from
the series, one finds in the same way that

    $m_X\left( \left| s - \sum_{p=1}^{N} u_p \right| \right) \le \sum_{p=N+1}^{\infty} m_X(|u_p|)$.

The result is $< r$ for $N$ large since $\sum m_X(|u_n|) < +\infty$. It follows that

    $\left| m_X(s) - \sum_{p=1}^{N} m_X(u_p) \right| \le m_X\left( \left| s - \sum_{p=1}^{N} u_p \right| \right) < r$ for $N$ large,

whence the theorem.

In the Lebesgue theory, the two preceding theorems are valid without the
hypothesis of uniform or normal convergence, which would considerably
simplify the arguments of Example 1; simple (or even only “almost everywhere”)
convergence is enough to assure the result; in fact, the hypothesis (8) even
implies “almost everywhere” absolute convergence of the series $\sum u_n(x)$, as
we shall see. On the other hand, hypothesis (ii), “dominated convergence”,
is essential even in the “grand” integration theory, where one ignores (in the
English sense) the “semiconvergent” integrals, so specific to $\mathbb{R}$.


Instead of integrating a function of $x$ depending on an integer $n$ one may
consider more generally an integral of the form

    $\int_X f(x, y) \, dx$

where $y$ varies in an arbitrary subset $Y$ of $\mathbb{R}$ or even of $\mathbb{C}$, and examine what
happens when $y$ tends to a closure point $b$ of $Y$. The hypotheses to make are
obvious:

(i) the function $x \mapsto f(x, y)$ is regulated for every $y$;
(ii) $\lim_{y \to b} f(x, y) = g(x)$ exists for every $x \in X$ and the limit is uniform on
every compact $K$ of $X$, i.e. for every $r > 0$ there is an $r' > 0$ such that

    $|f(x, y) - g(x)| < r$ for every $x \in K$

for every $y$ such that $|y - b| < r'$;
(iii) there exists a positive integrable function $p$ on $X$ such that $|f(x, y)| \le
p(x)$ on $X$ for every $y \in Y$ close enough to $b$.

Then we may write

    $\lim_{y \to b} \int_X f(x, y) \, dx = \int_X dx \lim_{y \to b} f(x, y)$.

The hypotheses (i), (ii) and (iii) show that $g$ is regulated and absolutely
integrable, after which it is enough to copy the proof of Theorem 19, replacing
$f_n(x)$ everywhere by $f(x, y)$ and the expression “for $n$ large” by “for $y$ close
enough to $b$” (or “for $y$ large” if $y$ tends to infinity). We could have established
this general result directly; the theorem for sequences can be deduced from
it on taking $Y = \mathbb{N}$ and $b = +\infty$.

In the most frequent applications of this result one seeks to show that the 
integral is a continuous function of y: 


Theorem 22. Let $X$ be an interval, $H$ a compact subset of $\mathbb{C}$, $f$ a function
defined and continuous on $X \times H$ and $\mu$ a function defined and regulated in
$X$. Assume that there is a positive function $p$ on $X$ such that $|f(x, y)| \le p(x)$
on $X \times H$ and $\int p(x) |\mu(x)| \, dx < +\infty$. Then the function $y \mapsto \int f(x, y) \mu(x) \, dx$
is continuous on $H$.


Hypotheses (i) and (iii) above are clearly satisfied by $f(x, y)\mu(x)$ when
$y$ tends to a $b \in H$. If $K$ is a compact subset of $X$, then the function $f$ is
uniformly continuous on the compact $K \times H$; consequently, the hypothesis
(ii) is satisfied also⁴⁵. Thus $\lim \int f(x, y) \mu(x) \, dx = \int f(x, b) \mu(x) \, dx$, qed.


45 Recall why. For every $r > 0$ there exists an $r' > 0$ such that the values of $f$ at
two points of $K \times H$ distant at most $r'$ from each other are equal to within $r$.
It follows that $|y - b| < r' \Longrightarrow |f(x, y) - f(x, b)| < r$ for every $x \in K$, which
means that, as $y$ tends to $b$, $f(x, y)$ tends to $f(x, b)$ uniformly on $K$. The factor
$\mu(x)$, which is bounded on every compact set, like every regulated function, does
not change the conclusion. Note in passing that if we have introduced a function
$\mu(x)$, it is because we do not yet know how to treat an integral $\int f(x, y) \, d\mu(x)$
with respect to an arbitrary measure on a noncompact interval.
In practice, the continuous function $f$ is defined on $X \times Y$ where $Y \subset \mathbb{C}$
is not necessarily compact: the case of an arbitrary interval in $\mathbb{R}$ or of an
open subset of $\mathbb{C}$ for example. To apply the theorem, it is enough to work
on an arbitrarily small neighbourhood of a point $b \in Y$ since continuity is a
property of local nature. So everything works if every $b \in Y$ has a compact
neighbourhood in $Y$. Now a neighbourhood of $b$ in $Y$ contains, by definition,
all the points of $Y$ whose distance to $b$ is sufficiently small. The hypothesis
in question thus means that there exists an $r > 0$ such that the set of $y \in Y$
such that $d(b, y) \le r$ (weak inequality) is compact. A subset of $\mathbb{C}$ having this
property at each of its points is said to be locally compact. This is the case
if $Y = F \cap U$ with $F$ closed and $U$ open⁴⁶: choose $r$ so that the closed disc
$d(b, y) \le r$ is in $U$, then take for a neighbourhood of $b$ in $Y$ the intersection of
this disc with $F$: it is closed in $\mathbb{C}$, so compact. In $\mathbb{R}$, every interval is locally
compact; $\mathbb{Q}$ is not (exercise!). In $\mathbb{C}$, the union $Y$ of the open disc $D$: $|z| < 1$
and of the compact interval $[1, 3]$ is not, even though both these two sets are:
the intersection of $Y$ with a closed disc of centre 1 is never closed. We might
have said all this in Chap. III, but the reader is perhaps grateful to have been
spared this at the beginning of the theory ...

In conclusion, Theorem 22 remains valid if one assumes H only locally 
compact. By a happy coincidence, the locally compact sets are, among the 
subsets of $\mathbb{C}$, those on which one may construct an integration theory à la
Lebesgue and, for a start, give a reasonable definition of Radon measures, as 
we shall see in n° 31. 


24 — Series and integrals 


One may sometimes compare an integral to a series, and vice-versa, to decide
on its convergence or divergence. If, for example, $f$ is a regulated function on
an interval $X = [a, +\infty[$ with $a$ finite, then $f$ has a limit value at $a$ and the
convergence of the integral on a neighbourhood of $a$ poses no problem; it is
then clear that

    $\int_a^{+\infty} |f(x)| \, dx < +\infty \iff \sum_{n > a} \int_n^{n+1} |f(x)| \, dx < +\infty$

because the partial sums of the series are, more or less, the partial integrals
over the intervals $[a, n]$.

Consider now a function $f$ defined for $x \ge a$ finite, positive, decreasing,
and tending to 0 at infinity (without this, for a decreasing function, the
integral has no chance of converging); being monotone, $f$ is regulated (and,
in applications, is always continuous). For every $n > a$ the integral of $f$ over
the interval $[n, n+1]$ lies between $f(n)$ and $f(n+1)$ since $f$ is positive and
decreasing. The series (1) is therefore of a similar nature to the series $\sum f(n)$


46 One can prove the converse, but it is hardly worthwhile. 
and consequently, the integral of $f$ on the interval $[a, +\infty[$ converges if and
only if the series $\sum f(n)$ converges, with

(24.1)    $\sum_{n > a} f(n) \le \int_a^{+\infty} f(x) \, dx \le f(a) + \sum_{n > a} f(n)$;

the term $f(a)$ comes from the interval $[a, p]$ where $p$ is the smallest integer
$> a$. A sketch will make the result obvious.

If for example $f(x) \sim c/x^s$ with $s$ real, the integral converges like the
Riemann series $\sum 1/n^s$, i.e. for $s > 1$.
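
The comparison between series and integral can be made concrete for $f(x) = x^{-s}$ on $[1, +\infty[$, where the integral equals $1/(s-1)$ exactly. The sketch below is an illustration under assumptions that are not in the text: the name `integral_test_bounds` is hypothetical, $a = 1$ is an integer, and the infinite series is replaced by a truncated sum, so the upper bound is only meaningful when the neglected tail is small (for instance $s \ge 1.5$ with the default number of terms).

```python
def integral_test_bounds(s: float, terms: int = 100000):
    """For f(x) = x**(-s) on [1, +oo[ with s > 1, return
    (truncated sum over n > 1, exact integral, f(1) + truncated sum)."""
    partial = sum(n ** (-s) for n in range(2, terms + 2))  # n = 2, 3, ... (n > a = 1)
    integral = 1.0 / (s - 1.0)                             # exact value of the integral
    return partial, integral, 1.0 + partial                # f(a) = f(1) = 1


if __name__ == "__main__":
    for s in (1.5, 2.0, 3.0):
        lower, integral, upper = integral_test_bounds(s)
        print(f"s = {s}: {lower:.4f} <= {integral:.4f} <= {upper:.4f}")
```

For $s = 2$ the three numbers are close to $\pi^2/6 - 1 \approx 0.645$, $1$ and $1.645$, which is the sandwich (24.1).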

There are also the integrals of “oscillating” functions. Consider for
example the integral

(24.2)    $I = \int_a^{+\infty} f(x) \sin(\pi x) \, dx$

where $f$ is again positive, decreasing, and tends to 0 at infinity. The integral
between $n$ and $n+1$ this time lies up to sign between $f(n)$ and $f(n+1)$ since,
on the interval considered, $\sin(\pi x)$ is either everywhere between 0 and 1, or
everywhere between $-1$ and 0. This suggests comparison with the alternating
series $\sum (-1)^n f(n)$, which converges since $f$ decreases and tends to 0. But it
is better to compare with the series with general term

    $u_n = \int_n^{n+1} f(x) \sin(\pi x) \, dx$.

It is clear that the $u_n$ are alternately positive and negative. On the other
hand

    $u_{n+1} = -\int_n^{n+1} f(x+1) \sin(\pi x) \, dx$

thanks to the change of variable $x \mapsto x+1$. Since $f(x+1) \le f(x)$, we conclude
that $|u_{n+1}| \le |u_n|$. Finally, and as we have seen, $|u_n|$ always lies between $f(n)$
and $f(n+1)$, so tends to 0. The alternating series $\sum u_n$ is therefore convergent.
Now let $p$ be the smallest integer $> a$. For $p \le n \le v \le n+1$ we have

    $\int_a^v f(x) \sin(\pi x) \, dx = \int_a^p f(x) \sin(\pi x) \, dx + (u_p + \ldots + u_{n-1}) + \int_n^v f(x) \sin(\pi x) \, dx$.

Since the last integral is, in modulus, $\le f(n)$ and so tends to 0, and since
the series $\sum u_n$ converges, it is clear that the left hand side tends to a limit as
$v \to +\infty$, namely

    $I = \int_a^p f(x) \sin(\pi x) \, dx + \sum_{n \ge p} u_n$.

The “remainder” of an alternating series being, in absolute value, less than
the first term neglected, one thus obtains, for every $n > a$, the inequality

(24.3)    $\left| I - \int_a^n f(x) \sin(\pi x) \, dx \right| \le f(n)$.


From this one can deduce an important result on the Fourier transform: 


Theorem 23. Let $f$ be a positive regulated function, defined for $x \ge a > -\infty$,
decreasing, and tending to 0 at infinity. Then the integral

    $\varphi(y) = \int_a^{+\infty} f(x) \sin(2\pi xy) \, dx$

converges for any $y \ne 0$, and is a continuous function of $y$.


To see this, assume $y > 0$ and perform the change of variable $2xy = u$,
whence

    $2y\varphi(y) = \int_{2ay}^{+\infty} f(u/2y) \sin(\pi u) \, du$.

Convergence is clear, and (3) can now be written

(24.4)    $\left| 2y\varphi(y) - \int_{2ay}^{n} f(u/2y) \sin(\pi u) \, du \right| \le f(n/2y)$.


Let us work on an interval $0 < y \le b$. We have $f(n/2y) \le f(n/2b)$ since $f$
is decreasing, so that $f(n/2y)$ converges uniformly to 0 on this interval. Thus
it remains to show that the integral in (4) is a continuous function of $y$ for
any $n$; for then $2y\varphi(y)$ will be the uniform limit of continuous functions on
$0 < y \le b$. Now, returning to the initial variable of integration $x = u/2y$, the
integral in question can be written, up to the continuous factor $2y$,

    $\int_a^{n/2y} \sin(2\pi xy) f(x) \, dx$,

and its continuity as a function of $y$ is clear, even though the upper limit of
integration depends on $y$. The reader may provide the $\varepsilon$, inspired by
Theorem 13 of n° 12.

Dirichlet’s integral

    $\int_0^{+\infty} \sin(2\pi xy) \, dx/x$

fits into this framework, for the function $\sin(2\pi xy)/x$ tends to $2\pi y$ at the
origin, so that it is enough to examine its behaviour at infinity, given by the
preceding theorem. (Note that in fact the integral does not depend on $y$).
Same remark for the Fresnel integrals of the kind

    $\int_0^{+\infty} \cos(2\pi x^2 y) \, dx, \qquad y \ne 0$;

the change of variable $x^2 = t$ leads (up to the constant factor 1/2) to

    $\int_0^{+\infty} \cos(2\pi yt) \, t^{-1/2} \, dt$;

there is no problem at $t = 0$ since $-\frac{1}{2} > -1$. The problem is to calculate the
integral explicitly.

The preceding theorem applies also to the Fourier integrals

    $\hat{f}(y) = \int_{-\infty}^{+\infty} f(x) e^{-2\pi ixy} \, dx$,

which, by Euler’s formulae, reduce to four integrals of the preceding type.
Theorem 23 thus applies here when $f(x)$ tends to 0 at infinity in a monotone
fashion for $|x|$ large: the integral converges for $y \ne 0$ and $\hat{f}(y)$ is continuous on
$\mathbb{R}^* = \mathbb{R} - \{0\}$. This is the case, for example, if $f(x) = p(x)/q(x)$ is a rational
function for which $d^\circ(q) = d^\circ(p) + 1$; the function $f$ tends to 0 monotonely
at infinity because its derivative has only a finite number of roots, so has
constant sign on a neighbourhood of $+\infty$ or $-\infty$. We should not forget that
the integral does not converge absolutely.


25 — Differentiation under the $\int$ sign


To extend the theorem on differentiation under the $\int$ sign to “improper”
integrals one considers as in n° 9 a continuous function $f$ on a rectangle
$X \times J$ where, this time, $X$ is no longer compact, and one assumes that $D_2 f$
exists and is continuous on $X \times J$. In ignorance of what a measure is one
may always examine an integral of the form

(25.1)    $g(y) = \int_X f(x, y) \mu(x) \, dx$,

where $\mu$ is a regulated function on $X$, and seek hypotheses to assure that

(25.2)    $g'(y) = \int_X D_2 f(x, y) \mu(x) \, dx$.

We assume of course that (1) and (2) are convergent integrals for any $y \in J$. In
problems of this kind the principle is the same as for the analogous problems
for series: one replaces $X = (a, b)$ by a compact interval $K = [u, v]$ contained
in $X$, applies Theorem 9 of n° 9 to the function⁴⁷

(25.3)    $g_K(y) = \int_K f(x, y) \mu(x) \, dx$,


47 We remarked at the end of n° 9 that Theorem 9 does not rest on the explicit
construction of the usual integral, but only on its properties of linearity and
continuity. These would be equally valid if one defined the integral by the
formula $\mu(f) = \int f(x) \mu(x) \, dx$. Theorem 9 does not apply directly to the function
$f(x, y)\mu(x)$, since it is no longer necessarily continuous, but the result still holds.
then one passes to the limit as $u$ and $v$ tend respectively to $a$ and $b$, i.e. as
$K$ tends to $X$. Since

(25.4)    $g'_K(y) = \int_K D_2 f(x, y) \mu(x) \, dx$

by Theorem 9 of n° 9, we have to show that the derivative of a limit is the
limit of the derivatives, a problem which Theorem 19 of Chap. III, n° 17 is
there to resolve: since $g_K(y)$ tends to $g(y)$ for every $y$, it is enough to show
that $g'_K(y)$ converges to the integral (2) uniformly on every compact subset
$H$ of $J$ as $K$ tends to $X$. Equivalently, that for every $r > 0$ there exists a
compact interval $K \subset X$ such that

(25.5)    $\left| \int_{X - K'} D_2 f(x, y) \mu(x) \, dx \right| < r$ for every $y \in H$ and every $K' \supset K$;

the integral (5) is actually the difference between $g'_{K'}(y)$ and the integral (2)
to which it must converge [and which we will have the right to denote $g'(y)$
after having justified the passage to the limit]. A brutal way of guaranteeing
(5) is to assume the existence of a positive function $p_H(x)$ such that

(25.6)    $|D_2 f(x, y)| \le p_H(x)$ with $\int_X p_H(x) |\mu(x)| \, dx < +\infty$

for any $x \in X$ and $y \in H$; the left hand side of (5) is then majorised by the
integral of $p_H |\mu|$ over $X - K'$, so is $< r$ for any $y \in H$ if $K'$ contains a large
enough compact interval $K \subset X$.

This is the argument used to show that if, for a series of differentiable
functions the series of its derivatives converges normally, then one may
differentiate it term-by-term. The integrals of $D_2 f$ on the compact sets play
the role of the partial sums of the derived series; the existence of a function
$p_H$ satisfying (6) plays the role of normal convergence and guarantees that
for $K \subset X$ large enough the “remainder” of the “sum” of the $D_2 f(x, y)$, i.e.
the integral on $X - K$, is in modulus $< r$ for any $y$. One cannot recommend
the reader too strongly to let himself be guided by these analogies between
“continuous sums”, i.e. integrals, and “discrete sums”, i.e. series.

We thus obtain a simple but useful result: 


Theorem 24. Let $X$ and $J$ be two intervals in $\mathbb{R}$, let $\mu$ be a regulated
function on $X$ and $f$ a function defined and continuous on $X \times J$. Assume that

(i) the integral

    $g(y) = \int_X f(x, y) \mu(x) \, dx$

converges for every $y \in J$;
(ii) the function $f$ has a continuous partial derivative $D_2 f(x, y)$ on $X \times J$;
(iii) for every compact $H \subset J$ there exists a positive function $p_H$ on $X$
such that $|D_2 f(x, y)| \le p_H(x)$ for every $x \in X$ and every $y \in H$, and
$\int p_H(x) |\mu(x)| \, dx < +\infty$.

Then the function $g$ is differentiable and

(25.7)    $g'(y) = \int_X D_2 f(x, y) \mu(x) \, dx$.


Example 1. If $X = Y = \mathbb{R}$, if $\mu$ is an absolutely integrable regulated function
on $\mathbb{R}$ and if $f(x, y) = e^{-2\pi ixy}$, then the function $g(y)$ is just the Fourier
transform $\hat{\mu}$ of $\mu$. Here

    $D_2 f(x, y) = -2\pi ix \, e^{-2\pi ixy}$

and so $|D_2 f(x, y)| = 2\pi |x| = p(x)$, and this is clearly the smallest positive
function which dominates $x \mapsto D_2 f(x, y)$ for a (or for all) $y \in Y$. Conclusion:
if

    $\int_{\mathbb{R}} |x\mu(x)| \, dx < +\infty$,

then $\hat{\mu}$ is differentiable and

    $\hat{\mu}'(y) = -2\pi i \int_{\mathbb{R}} x\mu(x) e^{-2\pi ixy} \, dx$

is the Fourier transform of $-2\pi ix\mu(x)$.


Example 2. In particular choose $\mu(x) = \exp(-\pi x^2)$, an integrable function
on $\mathbb{R}$ since it decreases at infinity more rapidly than $|x|^{-n}$ for any $n > 0$. We
have $-2\pi ix\mu(x) = i\mu'(x)$, whence, integrating by parts,

    $\hat{\mu}'(y) = i \int_{-\infty}^{+\infty} \mu'(x) \exp(-2\pi ixy) \, dx = -2\pi y \int_{-\infty}^{+\infty} \mu(x) \exp(-2\pi ixy) \, dx$

since the integrated-out part is zero because of the decrease of $\mu$ at infinity.
One obtains the relation

    $\hat{\mu}'(y) = -2\pi y \hat{\mu}(y)$,

a relation satisfied equally by $\mu$. The function $\hat{\mu}/\mu$ therefore has derivative
zero, whence

(25.8)    $\hat{\mu}(y) = c\mu(y) = c \exp(-\pi y^2)$

with a constant

(25.9)    $c = \hat{\mu}(0) = \int \exp(-\pi x^2) \, dx$.

It will emerge later that $c = 1$, and this without the least calculation, thanks
to the general Poisson summation formula

    $\sum \mu(n) = \sum \hat{\mu}(n)$,

where the sums are over $\mathbb{Z}$. In part this explains the role of the function
$\exp(-\pi x^2)$ in the calculus of probabilities (Gauss’ normal law).
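
The relation (25.8) and the value $c = 1$ announced above can be checked numerically, without anticipating the Poisson formula. The sketch below approximates the Fourier transform of $\exp(-\pi x^2)$ by a trapezoid rule on a truncated interval; the function name `fourier_gaussian`, the cut-off `half_width` and the number of steps are choices made for this illustration only.

```python
import cmath
import math


def fourier_gaussian(y: float, half_width: float = 8.0, steps: int = 4000) -> complex:
    """Trapezoid approximation of the integral of exp(-pi x^2) exp(-2 pi i x y)
    over [-half_width, half_width]; the tails beyond are negligibly small."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # end-point weights of the trapezoid rule
        total += weight * math.exp(-math.pi * x * x) * cmath.exp(-2j * math.pi * x * y)
    return h * total


if __name__ == "__main__":
    for y in (0.0, 0.5, 1.0):
        print(y, fourier_gaussian(y), math.exp(-math.pi * y * y))
```

At $y = 0$ the computed value approximates the constant $c$ of (25.9); for other $y$ the real part should agree with $\exp(-\pi y^2)$ and the imaginary part should vanish up to rounding.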


The preceding theorem can be used to show that a function is holomor- 
phic: 


Theorem 24 bis. Let $X$ be an interval in $\mathbb{R}$, $U$ an open subset of $\mathbb{C}$, $\mu$ a
regulated function in $X$, and $f$ a function defined and continuous on $X \times U$
satisfying the following conditions:

(i) the integral

    $g(z) = \int_X f(t, z) \mu(t) \, dt$

converges absolutely for every $z \in U$;

(ii) the function $z \mapsto f(t, z)$ is holomorphic on $U$ for every $t \in X$ and its
derivative $f'(t, z)$ with respect to $z$ is continuous on $X \times U$;

(iii) for every compact $H \subset U$, there exists on $X$ a positive function $p_H(t)$
such that $|f'(t, z)| \le p_H(t)$ for every $t \in X$ and every $z \in H$ and
$\int p_H(t) |\mu(t)| \, dt < +\infty$.

Then $g$ is holomorphic on $U$ and

(25.10)    $g'(z) = \int_X f'(t, z) \mu(t) \, dt$.

Putting $z = x + iy$, Theorem 24 shows that one may differentiate under
the $\int$ sign, either with respect to $x$ for $y$ given, or with respect to $y$ for $x$ given.
Since $z \mapsto f(t, z)$ satisfies Cauchy’s holomorphy condition [Chap. II, eqn.
(19.10)], so clearly $g$ does too, qed.


Example 3. If $\mu$ is a regulated function on the closed interval $[0, +\infty[$ and
is $O(t^N)$ at infinity for some $N$, its Laplace transform or complex Fourier
transform

    $L_\mu(z) = \int_0^{+\infty} e^{2\pi izt} \mu(t) \, dt$

is defined on $U$: Im(z) > 0 since then $|e^{2\pi izt} \mu(t)| = O(e^{-2\pi yt} t^N)$ at infinity.
Here $f(t, z) = e^{2\pi izt}$, whence $|f'(t, z)| = 2\pi t e^{-2\pi yt}$. Since every compact
subset $H$ of $U$ is contained in a half plane Im(z) $\ge \sigma > 0$, we have, in $H$,
that $|f'(t, z)| \le 2\pi t e^{-2\pi \sigma t} = p_H(t)$ with $\int p_H(t) |\mu(t)| \, dt < +\infty$ since the
function $p_H(t)\mu(t) = O(e^{-2\pi \sigma t} t^{N+1})$ is absolutely integrable on $\mathbb{R}_+$. The
function $L_\mu$ is therefore holomorphic on $U$.

This calculation, iterated, shows further that the (complex) derivatives
of $L_\mu$ are given by

(25.11)    $L_\mu^{(n)}(z) = (2\pi i)^n \int_0^{+\infty} e^{2\pi izt} t^n \mu(t) \, dt$.


Example 4. The function $\Gamma(s) = \int e^{-x} x^{s-1} \, dx$ is holomorphic in the half
plane Re(s) > 0 where it is defined. It is clear that

(i) the function $s \mapsto e^{-x} x^{s-1} = e^{-x} \exp[(s-1) \log x]$ is holomorphic for
every $x > 0$ since it is the composite of two holomorphic functions;

(ii) its complex derivative⁴⁸ $e^{-x} x^{s-1} \log x$ is continuous;

(iii) if $s$ remains in a compact subset $H$ of the half plane Re(s) > 0, strict
inequality, then $s$ is subject to conditions $a \le \mathrm{Re}(s) \le b$ with $0 < a <
b < +\infty$, so that

    $|e^{-x} x^{s-1} \log x| \le p_H(x) = \begin{cases} x^{a-1} |\log x| & \text{if } 0 < x \le 1, \\ e^{-x} x^{b-1} \log x & \text{if } 1 \le x < +\infty. \end{cases}$

Now the integral of $x^{a-1} \log x$ converges at 0 for $a > 0$ and that of
$e^{-x} x^{b-1} \log x$ converges at infinity for any $b$. Whence dominated convergence
and the result.

One sees at the same time that

(25.12)    $\Gamma'(s) = \int_0^{+\infty} e^{-x} x^{s-1} \log x \, dx$.


Example 5. Let us write

(25.13)    $\Gamma(s) = \int_0^1 e^{-x} x^{s-1} \, dx + \int_1^{+\infty} e^{-x} x^{s-1} \, dx$.

The second integral converges for any $s \in \mathbb{C}$. So, as in the preceding example,
it is a holomorphic function of $s$ in all of $\mathbb{C}$. In the first integral, term-by-term
integration of the exponential series gives, for Re(s) > 0,

(25.14)    $\int_0^1 e^{-x} x^{s-1} \, dx = \sum \frac{(-1)^n}{n!} \int_0^1 x^{n+s-1} \, dx = \sum \frac{(-1)^n}{n!(n+s)}$;

the operation is justified because (i) the series $\sum (-1)^n x^{n+s-1}/n!$ to be
integrated over $X = ]0, 1]$ converges normally on every compact $K \subset X$, (ii) the
series

    $\sum |x^{n+s-1}/n!| = e^x x^{\mathrm{Re}(s)-1} = p(x)$


48 Recall (Chap. III, n° 21, Theorem 22) that the chain rule valid for functions of 
a real variable is also valid for holomorphic functions of a complex variable. 
is integrable on $X$ since Re(s) > 0: so Theorem 20 applies.
The result (14) is a series of holomorphic functions on the open set

(25.15)    $U = \mathbb{C} - \{0, -1, -2, \ldots\}$,

a series which converges normally on every compact⁴⁹ $H \subset U$. If we knew
generally that the sum of such a series is again holomorphic, we could deduce
from this and from (13) that $\Gamma(s)$ is the restriction to the half plane Re(s) > 0
of a holomorphic function on $U$. We do not know this yet, even though we
know (Chap. III, n° 22) that if a sequence or series of holomorphic functions
on an open subset $U$ of $\mathbb{C}$ converges uniformly on every compact set and so
does its derived series, then the sum of the given series is holomorphic, and its
derivative is obtained by differentiating term-by-term. We stated then that
this result, a trivial consequence of the Cauchy equation and of the much
more general theorem on sequences or series of $C^1$ functions in the plane, is
much too weak to be of interest, but here it will suffice for our needs. It all
reduces to showing that the derived series

    $\sum (-1)^{n+1} / n!(s+n)^2$

of (14) converges normally on every compact $H \subset U$, which is clear.
Another procedure. An integration by parts shows immediately that

    $\int_0^1 e^{-x} x^{s-1} \, dx = \frac{1}{es} + \frac{1}{s} \int_0^1 e^{-x} x^s \, dx$

for Re(s) > 0; but the integral obtained converges for Re(s) > $-1$ and
depends holomorphically on $s$ in this half plane as in Example 4; this allows us
to extend the function⁵⁰ $\Gamma$ analytically (it would be better to say
holomorphically at this stage of the exposition) to the half plane Re(s) > $-1$ minus
the point 0. This done, a new integration by parts yields the relation

(25.16)    $\int_0^1 e^{-x} x^{s-1} \, dx = \frac{1}{es} + \frac{1}{es(s+1)} + \frac{1}{s(s+1)} \int_0^1 e^{-x} x^{s+1} \, dx$

with an integral converging now for Re(s) > $-2$ and so holomorphic in this
half plane. Pursuing the calculations, one defines $\Gamma(s)$ in all the half planes

49 It suffices to prove the existence of an $r > 0$ such that $|s + n| \ge r$ for any $s \in H$
and $n \in \mathbb{N}$. This is equivalent to saying that the distance $d(H, -\mathbb{N})$ between the
closed set $H$ and $-\mathbb{N}$ is $> 0$, which follows from the fact that they are disjoint,
with $H$ compact. One may also argue directly.

50 Given two open sets $U \subset V$ in $\mathbb{C}$ and an analytic (resp. holomorphic) function $f$ on
$U$, to extend $f$ to $V$ analytically (resp. holomorphically) consists of constructing
an analytic (resp. holomorphic) function on $V$ coinciding with $f$ on $U$. If $V$
is connected the analytic extension, if one exists, is unique (Chap. II, n° 20).
Recall also (Chap. VI, n° 14) that the terms “analytic” and “holomorphic” are
synonymous.
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Re(s) > —n, apart from the points 0,—1, etc. where the rational fractions 
1/s, 1/s(s +1), 1/s(s + 1)(s + 2) etc. appear. 

The final result is that one may extend the function Γ holomorphically 
to the open set (15) and that it is given there by the formula 

(25.17) $\Gamma(s) = \int_1^{+\infty} e^{-x}x^{s-1}\,dx + \sum_{n=0}^{\infty} \frac{(-1)^n}{n!(s+n)}$

in which everything converges for any s ≠ 0, −1, .... As we shall see in 
Chap. VII, n° 20, Example 4, these various methods of defining Γ(s) beyond 
the half plane Re(s) > 0 all lead to the same function. 
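
As a numerical sanity check of formula (25.17), the following sketch compares its right hand side with the standard library's `math.gamma` at a few real, non-integer points. The truncation of the integral at x = 60, the Simpson rule and the number of series terms are illustrative assumptions, not part of the text, and the function names are hypothetical.

```python
import math

def tail_integral(s, upper=60.0, steps=60000):
    """Simpson approximation of the integral of exp(-x) * x**(s-1) over [1, upper].

    Truncating at `upper` is an assumption; the neglected tail is of order exp(-upper).
    `steps` must be even.
    """
    h = (upper - 1.0) / steps

    def g(x):
        return math.exp(-x) * x ** (s - 1.0)

    total = g(1.0) + g(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(1.0 + k * h)
    return total * h / 3.0

def series_part(s, terms=40):
    """Partial sum of the series sum_{n >= 0} (-1)**n / (n! (s + n))."""
    return sum((-1) ** n / (math.factorial(n) * (s + n)) for n in range(terms))

def gamma_extended(s):
    """Right hand side of (25.17) for real s different from 0, -1, -2, ..."""
    return tail_integral(s) + series_part(s)

for s in (-0.5, -1.5, 2.5):
    print(s, gamma_extended(s), math.gamma(s))
```

The two printed columns should agree to many digits, including at negative s, where the original integral over ]0, +∞[ no longer converges.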


26 — Integration under the ∫ sign 


We saw in n° 9, Theorem 10, that if f is a continuous function on K x H, 
where K and H are compact intervals, then 


$\int_K dx \int_H f(x,y)\,dy = \int_H dy \int_K f(x,y)\,dx.$


Does this result extend to arbitrary intervals? Yes, on condition that, as 
always, one imposes hypotheses of domination by fixed integrable functions. 


Theorem 25 (Poor man’s Lebesgue-Fubini). Let X and Y be two intervals 
and f a continuous function on X × Y. Suppose that the following 
conditions are satisfied: 

(i) for every compact K ⊂ X there exists a positive integrable function 
$q_K(y)$ on Y such that $|f(x,y)| \le q_K(y)$ in K × Y; 

(ii) for every compact H ⊂ Y there exists a positive and integrable function 
$p_H(x)$ on X such that $|f(x,y)| \le p_H(x)$ on X × H; 

(iii) one of the two relations 

(26.1) $\int_X dx \int_Y |f(x,y)|\,dy < +\infty, \qquad \int_Y dy \int_X |f(x,y)|\,dx < +\infty,$

is satisfied. 

Then the two relations (1) are also satisfied, and 

(26.2) $\int_X dx \int_Y f(x,y)\,dy = \int_Y dy \int_X f(x,y)\,dx.$


In what follows we shall put 

(26.3) $g_J(x) = \int_J f(x,y)\,dy, \qquad h_I(y) = \int_I f(x,y)\,dx$

for any intervals I ⊂ X and J ⊂ Y. We shall also employ the notation $m_I$ to 
denote an integral over I. 

First we note that, by Theorem 22 of n° 23 for μ = 1, and by the hypotheses 
(i) and (ii), the functions (3) are continuous for any I and J. Let 
us start from the relation 

(26.4) $m_K(g_H) = \int_K dx \int_H f(x,y)\,dy = \int_H dy \int_K f(x,y)\,dx = m_H(h_K)$

valid for any compact⁵¹ K ⊂ X and H ⊂ Y (n° 9, Theorem 9). The whole 
problem is to pass to the limit under the ∫ signs as K and H tend to X 
and Y. 

(a) First we show that we may pass to the limit with respect to H for 
K fixed. Hypothesis (i) shows that f(x,y) is absolutely integrable on Y for 
every x ∈ X and that, further, 

(26.5) $\left| \int_Y f(x,y)\,dy - \int_H f(x,y)\,dy \right| \le \int_{Y-H} q_K(y)\,dy$ for every x ∈ K, 

a result ≤ r, for any given r > 0, once H is large enough, since $q_K$ is integrable on Y. Since the left 
hand side can also be written as $|g_Y(x) - g_H(x)|$, (5) shows that 

$\|g_Y - g_H\|_K \le r$

for H large enough. Since we may pass to the limit under the ∫ sign when 
we integrate uniformly convergent continuous functions on a compact set, we 
obtain 

(26.6) $\lim_{H \to Y} m_K(g_H) = m_K(g_Y).$


(b) Next we have to pass to the limit along K’. By (4) and the definition 
of an integral extended to X or Y, we have 


(26.7) $m_Y(h_K) = \lim_{H \to Y} m_H(h_K) = \lim_{H \to Y} m_K(g_H);$

there is no problem of convergence since, on the left hand side, the integral 
on K, i.e. the function $h_K(y)$, is majorised in modulus by $m(K)q_K(y)$ by 
hypothesis (i), whence the absolute convergence of the integral over Y. 
Comparing (6) and (7), we find 

(26.8) $m_Y(h_K) = m_K(g_Y)$

for every compact interval K ⊂ X, which would be (2) if X were compact. 
Likewise we find 

51 The reader will already no doubt have observed that we often omit the word 
“interval”. 

(26.9) $m_X(g_H) = m_H(h_X)$

for every compact H ⊂ Y. It remains to pass from here to (2), a relation 
which can again be written 

(26.9’) $m_X(g_Y) = m_Y(h_X).$


(c) These results do not rely on hypothesis (iii) of the statement and 
remain valid if one replaces f by the function |f|, which satisfies (i) and (ii) 
like f. Suppose now that $\int_X dx \int_Y |f(x,y)|\,dy < +\infty$. Applying (9) to |f|, we 
obtain 

(26.10) $\int_H dy \int_X |f(x,y)|\,dx = \int_X dx \int_H |f(x,y)|\,dy \le \int_X dx \int_Y |f(x,y)|\,dy$

for every compact H ⊂ Y. Taking the upper bound of the left hand side as 
H varies in Y, we obtain (Theorem 18, (i)) 

(26.11) $\int_Y dy \int_X |f(x,y)|\,dx \le \int_X dx \int_Y |f(x,y)|\,dy < +\infty.$

So we see that if the first integral (1) is finite, the second is too. But now 
one may argue starting from the second as we have just done starting from the 
first. Obviously one obtains the reverse inequality to (11), which must then 
in fact be an equality. Whence (2) for the function |f|. 

(d) It remains to obtain (2) for the function f itself. An easy method is to 
reduce to a real function by considering Re(f) and Im(f), functions which, 
bounded in modulus by |f|, again satisfy the hypotheses of the theorem. The 
function f being now assumed real, we write $f = f^+ - f^-$ as always; these 
two positive functions, majorised by |f|, also satisfy the hypotheses of the 
theorem, and since they are identical to their absolute values (11) reduces to 
(2) for these two functions; whence (2) for f. 

Another method, which, unlike the preceding, could be applied to functions 
with values in Banach spaces, even of infinite dimension, consists of 
starting from (9) and showing that, as H → Y, the two sides of (9) converge 
to the two sides of (9’). This is the case of the right hand side by definition 
of $m_Y(h_X)$. To examine the left hand side, first note that 

(26.12) $|m_X(g_Y) - m_X(g_H)| \le m_X(|g_Y - g_H|) \le \int_X dx \int_{Y-H} |f(x,y)|\,dy$

for every compact H ⊂ Y. Since we are already able to invert the integrations 
for the function |f|, we can apply this to the “intervals” X and Y − H (the fact 
that Y − H is the union of two disjoint intervals does not change anything). 
One may thus interchange the order of the integrations in the third term 
of (12), and thus obtain the integral on Y − H of a positive integrable function 
on Y, a result ≤ r for H large enough. The first term of (12) therefore 
tends to 0 as H → Y, so that the left hand side of (9) tends to that of (9’), qed. 


Given the conditions of the preceding theorem, one often says that f(x,y) 
is absolutely integrable on X × Y and puts 

(26.13) $\iint_{X \times Y} f(x,y)\,dx\,dy = \int_X dx \int_Y f(x,y)\,dy = \int_Y dy \int_X f(x,y)\,dx.$

In practice, one may often substitute the following condition for the hypotheses 
(i), (ii) and (iii): there exist positive, regulated and integrable functions p 
and q on X and Y respectively such that |f(x,y)| ≤ p(x)q(y) on X × Y. The 
hypotheses (i) and (ii) are satisfied on choosing 

$p_H(x) = \|q\|_H\,p(x)$ and $q_K(y) = \|p\|_K\,q(y)$

for any K and H (the uniform norms are finite since p and q are regulated). 
The hypothesis (iii) is also satisfied, since, for example, $\int_Y |f(x,y)|\,dy \le Mp(x)$ 
where $M = \int_Y q(y)\,dy$, whence the absolute convergence of the repeated integrals. 

The hypotheses (i) and (ii) are unnecessary in the complete Lebesgue-Fubini 
theorem, where one contents oneself with hypothesis (iii), but, once again, one cannot 
obtain a Rolls for the price of a VW. N° 33 of § 9 will provide, as an 
inevitable intermediate stage, the LF theorem for semicontinuous functions, 
thanks to which one can justify what all users do instinctively when they 
integrate a continuous function on a simple compact set in C. 

Exercise. Extend Theorem 10 of n° 9 to noncompact intervals X and Y. 


Example 1. First note that if the continuous functions f(x) and g(y) are 
defined and absolutely integrable on the intervals X and Y then the function 
f(x)g(y) is absolutely integrable on X × Y, and clearly 

$\iint_{X \times Y} f(x)g(y)\,dx\,dy = \int_X f(x)\,dx \int_Y g(y)\,dy,$

the product of the integrals of f and g. Now let us choose X = Y = ]0, +∞[, 
$f(x) = e^{-x}x^{a-1}$ and $g(y) = e^{-y}y^{b-1}$, with Re(a) > 0 and Re(b) > 0. We 
obtain the relation 

(26.14) $\Gamma(a)\Gamma(b) = \iint_{X \times Y} e^{-x-y}x^{a-1}y^{b-1}\,dx\,dy = \int_0^{+\infty} dx \int_0^{+\infty} e^{-x-y}x^{a-1}y^{b-1}\,dy.$

If, for each x, one effects the change of variable $y = (u^{-1} - 1)x$ in the 
y-integration, one finds 

(26.15) $\Gamma(a)\Gamma(b) = \int_0^{+\infty} dx \int_0^1 e^{-x/u}x^{a+b-1}(1-u)^{b-1}u^{-b-1}\,du = \int_0^1 (1-u)^{b-1}u^{-b-1}\,du \int_0^{+\infty} e^{-x/u}x^{a+b-1}\,dx;$

the change of variable x = tu in the x integral for u given then yields 

$\Gamma(a)\Gamma(b) = \int_0^1 (1-u)^{b-1}u^{-b-1}\,du \int_0^{+\infty} e^{-t}t^{a+b-1}u^{a+b}\,dt = \int_0^1 (1-u)^{b-1}u^{a-1}\,du \int_0^{+\infty} e^{-t}t^{a+b-1}\,dt,$

whence the famous formula 

(26.16) $\Gamma(a)\Gamma(b) = \Gamma(a+b)B(a,b)$


announced above. 
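
Formula (26.16) is easy to test numerically. The sketch below is only an illustration: it assumes real parameters a, b ≥ 1 (so that the integrand of B(a, b) has no endpoint singularity), takes B(a, b) to be the integral of u^(a−1)(1−u)^(b−1) over ]0, 1[, and uses a plain midpoint rule. The function names are hypothetical.

```python
import math

def beta_midpoint(a, b, steps=100000):
    """Midpoint-rule approximation of B(a, b), the integral of u**(a-1) * (1-u)**(b-1) over ]0, 1[.

    Assumes real a, b >= 1 so that the integrand stays bounded at both endpoints.
    """
    w = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * w
        total += u ** (a - 1.0) * (1.0 - u) ** (b - 1.0)
    return w * total

def gamma_product_gap(a, b):
    """Difference between the two sides of (26.16); it should be close to 0."""
    return math.gamma(a) * math.gamma(b) - math.gamma(a + b) * beta_midpoint(a, b)

for a, b in ((2.0, 3.0), (2.5, 3.5)):
    print(a, b, gamma_product_gap(a, b))
```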
27 — How to make $C^\infty$ a function which is not 


In about 1926-1927 the physicist Paul-Adrien-Maurice Dirac, in Dublin, had 
the idea of introducing a function δ(x) on R (and even on R³ or R⁴) possessing 
two supernatural properties: on the one hand 

δ(x) = 0 for x ≠ 0, δ(0) = +∞, 

while on the other hand⁵² 

$\int f(x)\delta(x)\,dx = f(0)$

for every just-a-little-reasonable function f; a little later, Dirac and the 
theoretical physicists juggled, in Relativity space, with “functions” such as 
δ(c²t² − x² − y² − z²) and, as the inventor of the theory of distributions⁵³ 
has written, “lived in a fantastic universe which they knew how to manage 
admirably, practically faultlessly, though never able to justify it at all”. 
Dirac had, to be sure, explained how one could “approximate” his function 
by considering, for ε > 0, the function equal to 1/2ε for |x| < ε and zero elsewhere, 
or the “bell curve” functions $\exp(-\pi x^2/\varepsilon)/\sqrt{\varepsilon}$, whose integral over R 
is equal to 1 and whose graph, as ε → 0, more and more closely resembles 
a sky-scraper of infinite height and of zero base representing the function δ, 
but since furthermore he allowed himself to differentiate his “function” and 
to write formulae such as 

$\int f(x)\delta'(x)\,dx = -f'(0)$


those mathematicians who tried to understand him understood nothing. It 
was twenty years later that the distributions of which we speak below gave a 
meaning to these calculations, and it was 1954 when the formulae in several 
variables of theoretical physics were at last justified — but not for nothing ... 
— by the Swiss mathematician Paul Méthée. 

Leaving differentiation aside for the moment, Dirac’s idea raises the question 
of whether one may approximate the value of a function f at a point, say 
x = 0, with the help of integrals involving f, for example by using functions 
$u_n(x)$ such that 

(27.1) $\lim_{n \to \infty} \int f(x)u_n(x)\,dx = f(0)$

52 In this n° and in the following, we write a simple ∫ for an integral extended over 
R. 

53 Laurent Schwartz, Un mathématicien aux prises avec le siècle (Paris, Odile Jacob, 
1997), pp. 230-231 (trans. A Mathematician Grappling with his Century, 
Birkhäuser, 2001). 




for every “reasonable” function f. 

Since as yet we know how to integrate only regulated functions, we shall 
suppose in what follows that f and the $u_n$ are such. To give a meaning to (1) 
for every “reasonable” function f (let us say bounded on R), we impose on 
the $u_n$ the condition 

(D 1) the $u_n(x)$ are absolutely integrable on R, 

a superfluous condition if, as is often the case, the $u_n$ are zero outside a 
compact interval. 

The most “reasonable” functions being the constants, we must consequently 
impose on the $u_n$ the condition 

(D 2) $\lim_{n \to \infty} \int u_n(x)\,dx = 1$

if we want (1) to hold. Then trivially 

(27.2) $f(0) = \lim_{n \to \infty} \int f(0)u_n(x)\,dx$

so, to obtain (1), we are led to examine the difference 

(27.3) $\int [f(x) - f(0)]u_n(x)\,dx = \int_{|x|<r} + \int_{|x|\ge r}$ (r > 0). 

Dirac’s idea, that “almost all” the mass of the measure $u_n(x)dx$ is concentrated 
on a neighbourhood of the origin for n large, leads us to introduce the 
condition 

(D 3) for any r > 0, 

(27.4) $\lim_{n \to \infty} \int_{|x| \ge r} |u_n(x)|\,dx = 0.$

If this is the case, and if, as always, one writes ‖f‖ for the uniform norm of f 
on R, then, since |f(x) − f(0)| ≤ 2‖f‖, the second integral on the right hand side 
of (3) is majorised by 2‖f‖ε for n large, whatever ε > 0 may be. 

Assume now that f is continuous at the origin, and consider, in (3), the 
integral over the interval |x| < r. For ε > 0 given one may choose r so that 

(27.5) $|f(x) - f(0)| \le \varepsilon$ for |x| < r. 

The integral is then, in absolute value, majorised by $\varepsilon \int |u_n(x)|\,dx$ where one 
integrates over |x| < r and a fortiori if one integrates over all R. To be sure 
that the result is arbitrarily small, it is thus enough to assume that 

(D 4) $\sup_n \int |u_n(x)|\,dx = M < +\infty,$

a superfluous condition, by (D 2), if the $u_n$ are positive. 

To recapitulate: ε > 0 being given, one chooses r > 0 to ensure (5); on 
the right hand side of (3) the first integral is then ≤ Mε for any n, and since, 
for r given, the second tends to 0 as n increases, by (4), we see finally that 
the integral (3) tends to 0, whence (1), in view of (2). 

A sequence of regulated functions satisfying the conditions (D 1) to (D 4) 
is called a Dirac sequence on R. We have established the following result: 


Dirac’s lemma. Let $(u_n)$ be a Dirac sequence. For every regulated function 
f that is defined and bounded on R, and is continuous at the origin, we have 


(27.6) $f(0) = \lim_{n \to \infty} \int f(x)u_n(x)\,dx.$
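
A small numerical experiment illustrating Dirac’s lemma: with the Gaussian sequence n·exp(−πn²x²) mentioned in this n° and the test function f = cos, the integrals in (27.6) can be compared with the closed form exp(−1/(4πn²)), which tends to f(0) = 1. The midpoint rule, the truncation of R to [−8, 8] and the function names are assumptions made for this sketch.

```python
import math

def u_n(n, x):
    """Gaussian Dirac sequence n * exp(-pi * n^2 * x^2); its integral over R equals 1."""
    return n * math.exp(-math.pi * (n * x) ** 2)

def dirac_integral(f, n, half_width=8.0, steps=100000):
    """Midpoint-rule approximation of the integral of f(x) * u_n(x) over [-half_width, half_width].

    Replacing R by a bounded interval is an assumption; the Gaussian tail beyond it is negligible.
    """
    w = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * w
        total += f(x) * u_n(n, x)
    return w * total

for n in (1, 4, 16):
    exact = math.exp(-1.0 / (4.0 * math.pi * n * n))  # closed form for f = cos
    print(n, dirac_integral(math.cos, n), exact)
```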


In practice, the Dirac sequences that one uses often satisfy more restrictive 
conditions, namely 


(i) $u_n$ is positive; 
(ii) for every r > 0, $u_n$ is zero outside [−r, r] for n large; 
(iii) the integral of $u_n$ is equal to 1 for every n. 


Conditions (i) and (iii) imply properties (D 1), (D 2) and (D 4) imposed in 
the general case, and (ii) implies (D 3). Condition (ii) is not satisfied in some 
important cases, as is shown by the exponential functions that we shall use 
in the following n°. 

If the $u_n$ satisfy (i), (ii) and (iii), or if, more generally, they are all zero for 
|x| > A, it is unnecessary to assume f bounded in the above statement since 
nothing changes if one replaces f(x) by 0 for |x| > A. 


Example 1. Consider on R a function u(x) which is regulated, positive, with 
total integral 1, and put $u_n(x) = nu(nx)$. The condition (D 1) is satisfied, 
also (D 2) (change of variable nx = y in the integral) and condition (D 3) is 
satisfied because 

$\int_{|x|>r} u_n(x)\,dx = \int_{|x|>nr} u(x)\,dx,$

a result which tends to 0 for every r > 0 since u is integrable on R. Consequently, 

(27.7) $f(0) = \lim_{n \to \infty} n \int f(x)u(nx)\,dx$

for every bounded regulated function f which is continuous at the origin. 
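
For the particular choice u(x) = exp(−πx²), which is an assumption made for this illustration (the example above allows any regulated positive u of integral 1), the tail mass in condition (D 3) has a closed form: the integral of $u_n$ over |x| > r equals erfc(√π·n·r). The sketch below simply tabulates it.

```python
import math

def tail_mass(n, r):
    """Integral of n * exp(-pi * n^2 * x^2) over |x| > r, equal to erfc(sqrt(pi) * n * r)."""
    return math.erfc(math.sqrt(math.pi) * n * r)

# For fixed r > 0 the mass outside [-r, r] collapses as n grows, which is condition (D 3).
for n in (1, 2, 4, 8):
    print(n, tail_mass(n, 0.5))
```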
If u is of compact support, say zero for |x| > A, the function $u_n(x) = nu(nx)$ is zero for 
|x| > A/n, i.e. outside an ever-shrinking interval with centre 0; the factor 
n in its definition shows that, on the other hand, it takes very large values on 
a neighbourhood of 0, an indispensable condition if its integral is to remain 
equal to 1. The lemma which we have generously attributed to Dirac shows 

Fig. 11. (horizontal axis marked at −1/n, 0 and 1/n) 

that in a certain sense Dirac’s δ “function” is the “limit” of the functions 
$u_n(x)$; indeed, if u is zero for |x| > A, then 

$\lim u_n(x) = 0$ for x ≠ 0 

since $u_n$ is zero outside [−A/n, A/n]. If one has chosen u so that u(0) > 0 it 
is also clear that $u_n(0) = nu(0)$ increases indefinitely. 

Most authors choose ultraregular positive functions for the $u_n$, with pretty 
bell-shaped graphs symmetric with respect to the origin, growing higher and 
higher, and whose base shrinks more and more, so that the area contained 
between the graph and the x axis remains equal to 1. One may, for example, 
choose $u_n(x) = c_n(1 - x^2)^n$ for |x| ≤ 1, = 0 for |x| > 1, the constant $c_n$ 
being chosen so that $\int u_n = 1$; the method of Example 1 would lead to the 
functions $u_n(x) = cn(1 - n^2x^2)$ for |x| ≤ 1/n, = 0 elsewhere, with c = 3/4. 
One often also chooses $u_n(x) = c_n\exp(-nx^2)$, with the appropriate choice 
of $c_n$; it would be better to take 

$u_n(x) = n\exp(-\pi n^2x^2)$

since the integral over R of the function $\exp(-\pi x^2)$ is equal to 1, as we 
shall see later; these functions do not have compact support but nevertheless 
form Dirac sequences, and, up to notation, had already appeared, not only in 
Dirac but also, a half-century earlier, in Weierstrass, in the proof of his theorem 
on approximation by polynomials (following n°). In fact, none of this 
is necessary because Dirac’s lemma, which assumes nothing as to the “elementary”, 
“classical”, or other nature of the $u_n$, generalises to all measures 
defined over all locally compact spaces, i.e. to situations where polynomials, 
exponentials and other curiosities of the real line are unknown. Let us 
add that if the reader were to restrict himself to proving Dirac’s lemma for 
the functions $n\exp(-\pi n^2x^2)$ for example, he might possibly be tempted to 
perform explicit calculations offering no benefit other than the risk of error. 


The most important consequence of Dirac’s lemma is provided by the 
following statement: 

Theorem 26. Let $(u_n)$ be a Dirac sequence. For every function f defined 
and continuous on R one has 


(27.8) $f(x) = \lim_{n \to \infty} \int f(x-y)u_n(y)\,dy$


uniformly on every compact subset of R if f is bounded or just if the $u_n$ 
vanish outside the same compact set. 


To establish (8) for f bounded it is enough to apply the lemma to the 
function y ↦ f(x − y). The little calculation in Dirac’s lemma shows moreover 
that 

(27.9) $\left| f(x)\int u_n(y)\,dy - \int f(x-y)u_n(y)\,dy \right| \le \int |f(x) - f(x-y)| \cdot |u_n(y)|\,dy$

and since the integral of $u_n$ tends to 1, we reduce to proving that the right 
hand side converges uniformly to 0 when x remains in a compact subset K 
of R. 

Take an ε > 0. Since f is uniformly continuous on every compact subset of R 
there is an r > 0 such that |f(x − y) − f(x)| ≤ ε for x ∈ K and |y| ≤ r. By 
virtue of property (D 4) of Dirac sequences, the contribution of the interval 
|y| ≤ r is thus ≤ Mε for any n and x ∈ K. 

For such a choice of r the contribution of the set |y| > r to the right hand 
side of (9) is less than the product of 2‖f‖ by the integral of $|u_n|$ extended 
over |y| > r; for n large it is therefore ≤ ε for any x ∈ R, by (D 3). 

Finally one obtains an estimate for (9) valid for all the x ∈ K simultaneously, 
whence the theorem in this case. 

If the $u_n$ are all zero for |y| > A, the first part of the argument survives 
unchanged. In the second, one remarks that the contribution of the set |y| > r 
is in fact an integral over r < |y| ≤ A; if x remains in the compact |x| ≤ B, 
the integral (9) involves only the pairs (x, y) satisfying |x| ≤ B, |y| ≤ A + B, 
a set on which the difference |f(x) − f(x − y)| is bounded, which allows one 
to conclude as in the preceding case. 
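
The uniform convergence asserted by Theorem 26 can be observed numerically. In the sketch below, the bounded continuous function f(x) = |x|/(1 + |x|), the Gaussian Dirac sequence, the compact set K = [−1, 1] sampled on a grid, and the truncation of the y-integral to |y| ≤ 6/n are all choices made for this illustration. For this f, which is 1-Lipschitz, the error is at most the integral of |y|·u_n(y), i.e. 1/(πn).

```python
import math

def u_n(n, y):
    """Gaussian Dirac sequence n * exp(-pi * n^2 * y^2)."""
    return n * math.exp(-math.pi * (n * y) ** 2)

def f(x):
    """A bounded continuous test function with a corner at the origin."""
    a = abs(x)
    return a / (1.0 + a)

def smoothed(n, x, steps=1000):
    """Midpoint approximation of the integral of f(x - y) * u_n(y) over |y| <= 6/n."""
    half = 6.0 / n  # the Gaussian mass beyond this range is negligible
    w = 2.0 * half / steps
    total = 0.0
    for k in range(steps):
        y = -half + (k + 0.5) * w
        total += f(x - y) * u_n(n, y)
    return w * total

def sup_error(n, points=101):
    """Largest deviation |f_n(x) - f(x)| over a grid of K = [-1, 1]."""
    xs = [-1.0 + 2.0 * i / (points - 1) for i in range(points)]
    return max(abs(smoothed(n, x) - f(x)) for x in xs)

for n in (4, 16, 64):
    print(n, sup_error(n), 1.0 / (math.pi * n))
```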


The interest of Theorem 26 is that it allows one to approximate the function 
f by $C^\infty$ functions, or even polynomials, much more “regular” than 
itself. The idea had already been met in 1926 in the work of the American Norbert 
Wiener, the future inventor of “cybernetics”, whom Dirac no doubt had not 
read. 

In the first case, one remarks for a start that, thanks to the function 


u(x) = exp(−1/x) for x > 0, = 0 for x ≤ 0, 


which is $C^\infty$ on R as we saw a long time ago, there are “many” $C^\infty$ functions 
of compact support⁵⁴ on R: to obtain one it suffices to multiply a $C^\infty$ 


54 Recall that the “support” of a function f is the smallest closed set outside which 
it is zero, i.e. the closure of the set {f ≠ 0}. 

function vanishing for x ≤ a by a $C^\infty$ function vanishing for x ≥ b. The 
set of these functions is a vector space which, since Schwartz, has been denoted 
D(R) or simply D. There are clearly even positive $C^\infty$ functions, zero 
outside arbitrarily given intervals; on dividing by their integral over R one 
may assume that the integral is equal to 1. Thus there exist Dirac sequences 
formed of functions $\varphi_n \in D$. Using the function u(x) above, one may for 
example take $\varphi_n(x) = c_n u(x + 1/n)u(1/n - x)$ with a constant $c_n$ such that 
$\int \varphi_n(x)\,dx = 1$. 

The functions (8) which, for every continuous function f on R, converge 
to f, are then $C^\infty$, as we shall see. It all amounts to showing that for every 
φ ∈ D zero for |x| > A and every continuous function f, the convolution 
product 


(27.10) $f * \varphi(x) = \int f(x-y)\varphi(y)\,dy = \int_{-A}^{A} f(x-y)\varphi(y)\,dy$


is $C^\infty$; this is the method of regularising an “arbitrary” function; it even 
provides a $C^\infty$ result for every regulated function f on R. 

We remark that the change of variable x − y = t transforms (10) into 


(27.10’) $f * \varphi(x) = \int f(t)\varphi(x-t)\,dt = \int \varphi(x-t)f(t)\,dt = \varphi * f(x)$


(one has dy = −dt, but the integral changes orientation and the factor −1 is 
eliminated on reestablishing the natural orientation). To check that the integral 
is a $C^\infty$ function of x one may restrict to a compact interval J = [−A, A]. 
Since φ(x − t) is zero for |x − t| > B, because φ is of compact support, the integral 
(10’) is, for every x ∈ J, extended over the interval K = [−A − B, A + B]. 
We are then in the situation of Theorem 9 of n° 9: the function φ(x − t) plays 
the rôle of the function f(x,y) of the theorem and f(t) that of μ. 

The convolution product is therefore differentiable, so, returning to (10), 


(27.11) $(f * \varphi)' = f * \varphi'$ or $D(f * \varphi) = f * D\varphi.$


Since the derivative Dφ of a function in D is again in D, one may iterate 
(11), which leads to the general relation 


(27.12) $(f * \varphi)^{(p)} = f * \varphi^{(p)}$ or $D^p(f * \varphi) = f * D^p\varphi$


hypothesising only that f is regulated on R. It is not the differentiability or 
the continuity of f which matters, it is that of φ. 

Theorem 27. For every function f defined and continuous on R there exists 
a sequence $f_n$ of $C^\infty$ functions which converges to f uniformly on every 
compact subset of R. If f is of compact support one may assume that the $f_n$ 
are zero outside a fixed compact set. 

Obvious: one applies Theorem 26 to a Dirac sequence formed by functions 
of D, bearing in mind what we have just established. If f is zero for |x| > A and 
if one assumes, for example, that the $\varphi_n$ vanish for |x| > 1/n, it is clear that 
the $f_n$ are all zero for |x| > A + 1, qed. 

If the function f is $C^1$, one can, in formula (10), differentiate directly 
with respect to x, again thanks to Theorem 9 of n° 9, which shows that 


(27.13) $D(f * \varphi) = Df * \varphi$


in this case. Replacing φ by the $\varphi_n \in D$ of a Dirac sequence one sees that 
the $Df_n$ converge to Df uniformly on every compact set. Therefore: 


Corollary. Let f be a function of class $C^p$ (p ≤ +∞) on R. There exists a 
sequence $f_n$ of $C^\infty$ functions such that 


$\lim f_n^{(r)}(x) = f^{(r)}(x)$

uniformly on every compact for every finite r ≤ p. 


All these results extend, with the same proofs, to functions defined on 
R?. 


28 — Approximation by polynomials 


We shall now prove Weierstrass’ theorem on the uniform approximation of 
continuous functions by polynomials on a compact set. The proof we shall 
give of this — Weierstrass’, up to a few details — also uses approximation by 
convolution products. 

(i) We start from a function u which is positive, integrable, has integral 1 
over R, but is not zero for |x| large because we shall choose for u an everywhere 
convergent power series. For every function f(x) which is continuous 
on R and zero for |x| > A given, we put 


(28.1) $f_n(x) = n \int f(y)u(nx - ny)\,dy.$


These functions being the convolution products of f by the functions 
$u_n(x) = nu(nx)$, which form a Dirac sequence, the $f_n$ converge to f uniformly 
on every compact set. 

(ii) We assume that $u(x) = \sum a_p x^p$ is the sum of a power series that 
converges for every x ∈ C; then 


$f_n(x) = n \int f(y)\,dy \sum a_p n^p (x-y)^p.$


For x and n given, the power series in x − y converges normally on every 
compact set, and in particular on the interval |y| ≤ A outside which f(y) = 0. 
Since f is bounded we may integrate term-by-term, whence 

(28.2) $f_n(x) = \sum n a_p n^p \int (x-y)^p f(y)\,dy = \sum f_{n,p}(x),$


where we integrate over |y| ≤ A. For |x| ≤ A we have |x − y| ≤ 2A, whence 

(28.3) $|f_{n,p}(x)| \le 2An\|f\| \cdot |a_p|(2nA)^p,$


the factor 2A coming from the integration over |y| ≤ A. Since the power 
series u(x) converges for any x, so for example at the point 2nA, the right 
hand side of (3) is the general term of a convergent series. The series (2) 
therefore converges normally on |x| ≤ A. 

(iii) The general term of the series (2) is a polynomial in x, as we see 
on expanding $(x - y)^p$. Since it converges normally on |x| ≤ A, its partial 
sums converge to $f_n$ uniformly on this interval. Since the $f_n$ converge to 
f uniformly in |x| ≤ A, by (i), we can, replacing them by partial sums of 
sufficiently high order, obtain a sequence of polynomials converging to f 
uniformly on |x| ≤ A. 

(iv) We still have to show the existence of u. The function 
$u(x) = c\exp(-\pi x^2)$ meets the requirements. It is clearly positive, integrable, 
has total integral 1 for c suitably chosen (c = 1, in fact) and is expandable 
as an everywhere convergent power series. 


Theorem 28 (Weierstrass, 1885). Let f be a real function defined and 
continuous on a compact interval K ⊂ R. Then there exists a sequence of 
polynomials which converges to f uniformly on K. 


It is enough to observe that f can be extended to a continuous function 
on all R, zero for |x| large: complete the graph of f by linear functions. 
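
To see Theorem 28 at work numerically, the sketch below deliberately swaps in a different classical construction, Bernstein polynomials on [0, 1], rather than the convolution argument of this n°, whose truncated power series are numerically awkward in floating point. The test function |x − 1/2|, the sampling grid and the function names are assumptions made for the illustration.

```python
import math

def bernstein(f, n, x):
    """Value at x in [0, 1] of the n-th Bernstein polynomial of f.

    It is the sum over k of f(k/n) * C(n, k) * x^k * (1 - x)^(n - k), a polynomial of degree <= n.
    """
    return sum(
        f(k / n) * math.comb(n, k) * x ** k * (1.0 - x) ** (n - k)
        for k in range(n + 1)
    )

def sup_error(f, n, points=201):
    """Largest deviation between f and its n-th Bernstein polynomial on a grid of [0, 1]."""
    xs = [i / (points - 1) for i in range(points)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in xs)

def tent(x):
    """Continuous on [0, 1] but not differentiable at 1/2."""
    return abs(x - 0.5)

for n in (10, 40, 160):
    print(n, sup_error(tent, n))
```

The printed errors decrease slowly, roughly like 1/√n for this function, which illustrates the existence statement and is not meant as an efficient approximation scheme.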


For a function f defined and continuous on all R, or, more generally, on 
an unbounded interval I, it is impossible to find a sequence of polynomials 
$p_n$ which converges uniformly to f on I except in the trivial case where f 
is itself a polynomial (Chap. III, n° 5, end). As we then noted, it is always 
possible to demand that the $p_n$ should converge to f uniformly on every 
compact K ⊂ I. 

If the interval I is bounded but not compact, we have seen in n° 8, as 
a consequence of Corollary 2 of the uniform continuity theorem, that approximation 
by polynomials is possible only if f is uniformly continuous on 
I; f then extends to be a continuous function on the compact interval obtained 
by adjoining the endpoints to I, and Weierstrass’ theorem applied to 
this compact interval yields the result. In short, Weierstrass’ theorem is best 
possible. Having said this, there are more difficult approximation theorems, 
where, instead of considering polynomials, one considers, for example, linear 
combinations of exponential functions $\exp(a_n x)$ with given $a_n$ (suitably 
chosen ...). 

Up to details, the proof of Weierstrass’ theorem which we have presented 
is that of Weierstrass himself; his aim was to show that, though it is certainly 
impossible to represent every continuous function by simple analytic formulae 
(algebraic, power series, etc.), it is however possible to find series of simple 
functions (as it happens, polynomials and not only monomials) which 
represent them. 

There are many other proofs of the theorem; Hairer and Wanner, Analysis 
by Its History, p. 264, cited a dozen (from G. Meinardus, 1964), the latest 
dating from 1934. This leads me to suspect, without having checked, that 
perhaps one should add to their list the one and only model proof, applicable 
in the much more general framework of compact topological spaces, namely 
the Stone-Weierstrass theorem, a generalisation obtained in the 1930s by a 
Chicago mathematician who notably wrote in this period the first system- 
atic exposition of the theory of “abstract” Hilbert spaces; he did much after 
1945 to invite or recruit foreign mathematicians — I benefitted in 1950 — and 
moreover had a pronounced taste for gastronomy in general and French in 
particular; this was a good sign in an American, but one has to say that he 
was the son of a Chief Justice of the Supreme Court and not of a corn farmer 
from the Bible Belt. The theorem is as follows. Assume given on a compact 
space X (Appendix to Chap. III, n° 7) a set A (the initial letter of the 
word “algebra”) of continuous functions with complex values satisfying the 
following conditions: 


(a) the complex constants are in A; 

(b) the sum and the product of two functions of A are again in A; 

(c) for every f ∈ A, the complex conjugate function $\bar f$ is in A; 

(d) for any distinct points x, y ∈ X there exists an f ∈ A such that 
f(x) ≠ f(y). 


Then every continuous function on X is the uniform limit on X of a sequence 
of functions $f_n \in A$. For a proof with hardly any calculations, see Dieudonné, 
Vol. 1, Chap. VII, n° 3, or Serge Lang, Analysis I (Addison-Wesley, 1968), 
Chap. VIII, n° 5. If X is a compact subset of C, one may take for A the set of 
polynomials in x and y [but not just the polynomials in z: they do not satisfy 
condition (c), without mentioning the fact that, in an open subset of C, a 
uniform limit of polynomials in z is holomorphic, as we shall see], whence 
Weierstrass’ theorem for two variables, or for p variables on taking X in $R^p$. 

In particular let us take for X the circle |z| = 1, the set of complex 
numbers of the form $z = e^{2\pi it}$ with t ∈ R defined modulo Z. A function f 
defined and continuous on X is transformed into a function $g(t) = f(e^{2\pi it})$ 
defined, continuous and of period 1 on R, and vice-versa as is easy to see 
(one has only to check continuity). On X, a polynomial in $x = (z + \bar z)/2$ and 
$y = (z - \bar z)/2i$, i.e. in z and $\bar z$, clearly reduces to a finite sum of the form 
$\sum a_n e^{2\pi int}$ with the n ∈ Z, in other words to a trigonometric polynomial 
(Chap. III, n° 5). Corollary: every function defined, continuous and periodic 
on R is the uniform limit of trigonometric polynomials, as announced in 
Chap. III, n° 5. This is the result on which one may base the whole theory 
of Fourier series. One may also, as we shall see in the chapter dedicated to 
the subject, prove it by explicit elementary calculations, as did Weierstrass 
himself. But Stone’s method applies to generalisations of Fourier series (harmonic 
analysis on compact groups for example) where direct and explicit 
calculations would be impossible. Moreover, in this way one has no need of 
providential functions like $\exp(-\pi x^2)$. 

This kind of “nonconstructive” proof naturally does not satisfy the calcu- 
lators. There is a proof with calculations in Hairer and Wanner, pp. 264-268, 
and, moreover, the graphs showing the approximations they provide. 


29 — Functions having given derivatives at a point 


To end this section with a nonobvious theoretical exercise we shall prove the 
following result⁵⁵: 


Theorem 29 (Emile Borel, 1895). For every sequence $(a_n)$ of complex 
numbers there exists an indefinitely differentiable function f of compact support 
on R such that $f^{(n)}(0) = a_n$ for every n ∈ N. 


Our first move, faced by this theorem, is to put 


(29.1) $f(x) = \sum a_n x^n/n!$


in accordance with Maclaurin’s formula. Bad idea: the series has every chance 
of diverging for x ≠ 0. 

All the same, (1) contains the germ of an idea for a proof. The function 
$a_n x^n/n! = a_n x^{[n]}$ has the virtue that, at the origin, all its derivatives are zero 
except for the n-th, which is equal to $a_n$. We shall replace it by a function 
$f_n \in D$, the space of $C^\infty$ functions of compact support on R, possessing 
the same properties but making the series converge. And since we will have 
to calculate the successive derivatives of the series it will be necessary to 
differentiate it term-by-term, i.e. to apply Theorem 20 of Chap. III, n° 17. 
In other words, the function f will be given by the formula 


(29.2) $f(x) = \sum f_n(x)$

where the $f_n \in D$ satisfy, for example, 

(29.3) $f_n(x) = 0$ for |x| > 1 

to yield a result of compact support, satisfy also 


°° The proof which follows (H. Mirkil, 1956) develops the one that one finds for ex- 
ample in Lars Hérmander, The Analysis of Linear Partial Differential Operators 
(Vol. 1, Springer-Verlag, 1983, p. 16), where it takes sixteen lines. Borel’s com- 
plete result is much stronger but requires difficult results on analytic functions: 
one may assume f to be of class C™ on a neighbourhood of 0 and real-analytic 
apart from at the origin. See Remmert, Funktionentheorie IT, p. 237. 
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r=n 


224) n(0) < { ‘ 7 rén 


and finally, and this is the crucial point, are chosen so that, for every r > 0, 
the series of the derivatives }~* FL (2) converges uniformly on |z| < 1 [or on 
R, which comes to the same by (3)]. The proof divides into several parts. 


(i) Construction of $f_0$ or, equivalently, of a function $h$ which is zero for
$|x| > 1$, equal to 1 at $x = 0$ and all of whose derivatives are zero at the origin.
We start with a function of the type


(29.5)    $q(x) = 1$ for $|x| \le A$,  $q(x) = 0$ for $|x| > A$


and “regularise” it with the help of a convolution product


(29.6)    $h(x) = \int \varphi(x - y)\,q(y)\,dy$


with a $\varphi \in \mathcal{D}$, positive, of total integral equal to 1, and zero for $|x| > C$,
where $C$ will be chosen later. The integral is taken over the interval $|y| \le A$
and cannot be $\ne 0$ for a given $x$ unless there exists a $y$ satisfying $|y| \le A$ and
$|x - y| < C$ simultaneously, which requires $|x| < A + C$. Then


$h(x) = \int_{-A}^{A} \varphi(x - y)\,dy = \int_{x-A}^{x+A} \varphi(z)\,dz.$


The support $[-C, C]$ of $\varphi$ is contained in $[x - A, x + A]$ so long as
$x - A \le -C < C \le x + A$, i.e. $C - A \le x \le A - C$; choosing $A = 3/4$ and $C = 1/4$,
one sees then that the function $h$ is zero for $|x| > 1$ and equal to $\int \varphi(z)\,dz = 1$
for $|x| \le 1/2$. Its graph is of the type below (fig. 12).
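
The construction of step (i) can be checked numerically. The sketch below is only an illustration under stated assumptions: the text does not specify $\varphi$, so the standard bump $\exp(-1/(C^2 - z^2))$, normalised numerically, is assumed here, and the names `bump`, `integrate`, `phi` and `h` are hypothetical helpers rather than notation from the text.

```python
import math

A, C = 0.75, 0.25  # the choice A = 3/4, C = 1/4 made in the text


def bump(z):
    """Unnormalised smooth bump, zero for |z| >= C (an assumed choice of phi)."""
    if abs(z) >= C:
        return 0.0
    return math.exp(-1.0 / (C * C - z * z))


def integrate(func, lo, hi, steps=4000):
    """Midpoint rule on [lo, hi]; returns 0 for an empty interval."""
    if hi <= lo:
        return 0.0
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))


TOTAL = integrate(bump, -C, C)  # normalising constant, so that phi has integral 1


def phi(z):
    return bump(z) / TOTAL


def h(x):
    """h(x) = integral of phi over [x - A, x + A], clipped to the support [-C, C]."""
    return integrate(phi, max(x - A, -C), min(x + A, C))


for x in (0.0, 0.5, 0.75, 1.0, 1.5):
    print(f"h({x}) = {h(x):.6f}")
```

Clipping the interval of integration to $[-C, C]$ is what makes $h$ vanish exactly, and not just approximately, as soon as $[x - A, x + A]$ no longer meets the support of $\varphi$.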


(ii) Choice of the $f_n$. We put

(29.7)    $f_n(x) = h(b_n x)\,a_n x^{[n]},$


with the $b_n > 0$ to be chosen later. Since we want the $f_n$ to be zero for
$|x| > 1$, it is prudent to impose the condition


(29.8)    $b_n \ge 1$


on them. Let us calculate the derivatives of the $f_n$ at the origin. They are
obtained from Leibniz’ formula: $(fg)^{[r]} = \sum f^{[r-p]} g^{[p]}$. At $x = 0$ all the
derivatives of $x^{[n]}$ are zero except the $n$-th, equal to 1. Those of $h(b_n x)$ are all
zero at the origin starting from the first. The derivative of order $r$ of $f_n$ cannot
be $\ne 0$ unless there exists a $p$ such that $p = n$ and $p = r$ simultaneously, in
other words if $r = n$. In this case, there remains $f_n^{(n)}(0) = a_n$, so that the $f_n$
do satisfy condition (4), and this whatever the $b_n$.

(iii) Convergence of the series $\sum f_n^{(r)}(x)$. We shall show that one may
choose the $b_n$ so that


(29.9)    $|f_n^{(r)}(x)| \le 1/2^n$ for every $x$, every $r$, and every $n > r$,


the sole significance of the numbers $1/2^n$ being that they form a convergent
series. As this is the case, it is clear that the series $\sum f_n(x)$ will be normally
convergent, as will be all the derived series; the sum of the series will therefore
be in $\mathcal{D}$ and its derivatives, calculated by differentiating the series term-by-term,
will be the $a_n$ at the origin, by the relations (4).

It remains to choose the $b_n$. By (7) and Leibniz, we have


(29.10)    $f_n^{(r)}(x) = \sum_p ?\,h^{(r-p)}(b_n x)\,b_n^{r-p}\,a_n x^{n-p}$


with numerical coefficients denoted ? and whose exact values are of little
importance. Since $h(b_n x) = 0$ for $|x| > 1/b_n$, it suffices, to evaluate the
result, to work on the interval $|x| \le 1/b_n$, which allows us to estimate the
monomials appearing in (10). In this interval we have $|b_n^{r-p} x^{n-p}| \le b_n^{r-n}$, an
expression independent of $p$, whence, passing to the uniform norms,


(29.11)    $\|f_n^{(r)}\| \le \sum_p ?\,\|h^{(r-p)}\|\,|a_n|\,b_n^{r-n} = M_{r,n}\,b_n^{r-n} \le M_{r,n}/b_n$


for $r < n$. Since, for $n$ given, the conditions (9) to be satisfied involve only the
$r < n$, so are finite in number, it then suffices, to satisfy them simultaneously,
to choose $M_{r,n}/b_n \le 1/2^n$ for every $r < n$, i.e.

$b_n \ge \max_{0 \le r < n} 2^n M_{r,n},$

qed.

This proof is typical of the current techniques in analysis. All the work 
consists of rigorously controlling the orders of magnitude of the numbers or 
functions that one is manipulating. Nothing is calculated explicitly. We are 
at the antipodes of the analysis of the Founders. 
30 — Radon measures on a compact set


As we have already observed on several occasions, many of the results of 
integration theory use only a few quasi-algebraic properties of the integral 
m(f) of a function: linearity, positivity and continuity with respect to uniform 
convergence, in other words, Theorem 1 of n° 2. 

We have also observed on occasion that there are curious analogies be- 
tween integrals and series. Let us work on a compact interval of R and con- 
sider, for example, the two following situations: 

(i) Theorem 9 of n° 9 and Theorem 24 of n° 25 which allow one to calculate
the derivative of a “continuous” sum


(30.1)    $\varphi(y) = \int f(x, y)\mu(x)\,dx$

of functions by the formula

(30.2)    $\varphi'(y) = \int D_2 f(x, y)\mu(x)\,dx,$


(ii) Theorem 19 of Chap. III, n° 17 which, translated into the language
of series, allows one to differentiate a “discrete” sum


(30.3)    $\varphi(y) = \sum f_n(y)$

of functions by the formula

(30.4)    $\varphi'(y) = \sum f_n'(y).$


The analogy would be even clearer if, starting from a finite or denumerable
set $D$ of points of $\mathbf{R}$, a scalar function $\mu(\xi)$ on $D$ satisfying $\sum |\mu(\xi)| < +\infty$,
and a function $f(\xi, y)$ defined on $D \times Y$, one put


(30.3’)    $\varphi(y) = \sum f(\xi, y)\mu(\xi)$


when one would have


(30.4’)    $\varphi'(y) = \sum D_2 f(\xi, y)\mu(\xi)$


of course under suitable hypotheses as in the case (i).

One may unify the two cases formally by writing, for every reasonable
function $f$ of a “continuous” or “discrete” variable, $\mu(f) = \int f(x)\mu(x)\,dx$
in the first case and $\mu(f) = \sum f(\xi)\mu(\xi)$ in the second; using the notation
$f_y(x) = f(x, y)$ one then has


$\varphi(y) = \mu(f_y)$  and  $\varphi'(y) = \mu[(D_2 f)_y]$

in both cases.

We are led to introduce, in a general way, functions $f \mapsto \mu(f)$ in which
the variable $f$ is a more or less arbitrary function on a given interval $X$ (or
a metric space, or even, in the “abstract” theory of integration, an arbitrary
set) possessing properties formally analogous to those of integrals and series.
Clearly one has to impose some restrictions on the category of functions $f$
considered: it is not possible to define the expression $\int f(x)\,dx$ in a natural
way for every function $f$ on $\mathbf{R}$. On the other hand, the problem, whether
dealing with series or integrals, has always been the following: one is given
$\mu(f)$ for particularly simple functions $f$ (finite sums in the discrete case, step
functions in the continuous case) and one hopes to extend the construction
in a natural way to more complicated functions (series in the discrete case,
integrals of regulated or semicontinuous functions, or even more general ones in
the Lebesgue theory, in the case of continuous sums).


In the simplest case, of a compact interval $K \subset \mathbf{R}$, the constructions in
n° 1 and 2 of this Chapter led us to associate to each step function $\varphi$ a number
$\mu(\varphi)$ possessing the following properties:


(i) linearity: $\mu(\alpha\varphi + \beta\psi) = \alpha\mu(\varphi) + \beta\mu(\psi)$ for any constants $\alpha$ and $\beta$
and step functions $\varphi$ and $\psi$;


(ii) continuity with respect to uniform convergence: there exists a constant
$M(\mu) \ge 0$ such that


(30.5)    $|\mu(\varphi)| \le M(\mu)\|\varphi\|_K$


for any $\varphi$. If, for every interval $I \subset K$, we write $\chi_I$ for the function equal to
1 on $I$ and to 0 in $K - I$, we may then associate a “measure”


$\mu(I) = \mu(\chi_I)$

to $I$, which manifestly has the additivity property (M 2) of n° 1:

$I = I_1 \cup \ldots \cup I_n \Longrightarrow \mu(I) = \mu(I_1) + \ldots + \mu(I_n)$


if the $I_k$ are pairwise disjoint, because $\chi_I$ is then the sum of the characteristic
functions of the $I_k$. From this one can calculate the integral $\mu(\varphi)$ of every
step function using a finite partition of $K$ into intervals $I_k$ on which $\varphi$ is
constant; on choosing points $\xi_k \in I_k$, one has


$\mu(\varphi) = \sum \varphi(\xi_k)\mu(I_k)$


since

$\varphi(x) = \sum \varphi(\xi_k)\chi_{I_k}(x)$


for every $x \in K$, whence the formula by the linearity of $\varphi \mapsto \mu(\varphi)$.
Starting from this data one may then define $\mu(f)$ for every regulated
function $f$ on $K$ by passing to uniform limits: if the step functions $\varphi_n$ converge
uniformly to $f$ then the relation (5), which implies


$|\mu(\varphi_p) - \mu(\varphi_q)| \le M(\mu)\|\varphi_p - \varphi_q\|_K,$


shows that the integrals $\mu(\varphi_n)$ form a Cauchy sequence, so converge; the limit
depends only on $f$ since, if $(\psi_n)$ is another uniform approximation to $f$, the
relation (5) shows that $\mu(\varphi_n) - \mu(\psi_n)$ tends to 0. (Compare the construction
of the real numbers starting from Cauchy sequences of rational numbers.)
Whence $\mu(f)$, with, quite clearly, the two usual properties of linearity and
continuity:


$|\mu(f)| \le M(\mu)\|f\|_K.$


Note in passing that this construction, which does not involve the “lower”
and “upper” integrals of n° 1, but applies (no big deal!) only to regulated
functions, does not use the hypothesis of positivity of $\mu$, namely that


(30.6)    $\varphi \ge 0 \Longrightarrow \mu(\varphi) \ge 0.$


If this is satisfied then the relation $\varphi \le \psi$ implies $\mu(\varphi) \le \mu(\psi)$ and all the
arguments of n° 1 concerning the lower and upper integrals of a bounded function
apply unchanged. As we have observed since n° 1, the three conditions
imposed on our functions $\mu(I)$ would be satisfied if one put $\mu(I) = \mu(v) - \mu(u)$
for $I = (u, v)$, where $\mu(x)$ is an increasing function on $K$. Note also that this
construction mingles the discrete and the continuous sums: for example, put


$\mu(\varphi) = \int \varphi(x)\mu(x)\,dx + \sum c(\xi)\varphi(\xi)$


where one integrates over $K$ and where $\sum |c(\xi)| < +\infty$; if $\varphi$ is the characteristic
function of an interval $I \subset K$ of any type, one finds clearly


$\mu(I) = \int_I \mu(x)\,dx + \sum_{\xi \in I} c(\xi).$


Physically, this comes down to considering that one has, on the one hand, a
distribution of masses on $K$ whose density in the usual sense (the ratio of the
mass of an “infinitely small” segment of $K$ to its length) is given by the regulated
function $\mu(x)$ and, on the other hand, a countable set of “point” masses
$c(\xi)$. One also meets this kind of situation in the most modern physics: in the
spectrum of the radiation emitted by the Sun, there are “bands”, whose intensity
is a continuous function of the frequency, and “lines” which concentrate
a nonzero intensity on an interval consisting of a single frequency. Nothing
very artificial here; Newton would have said that one meets this in Nature ...
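
To make the mixture of a density and of point masses concrete, here is a minimal sketch with exact rational arithmetic. Everything specific in it is an assumption chosen for illustration: $K = [0, 1]$, the density $2x$, the two point masses, and the helper names `contains`, `measure` and `integrate_step`. One of the masses is negative, which is harmless since the construction by uniform limits does not use positivity.

```python
from fractions import Fraction as Fr

# Hypothetical data on K = [0, 1]: density mu(x) = 2x (antiderivative x^2)
# plus two point masses c(xi).
POINT_MASSES = {Fr(1, 4): Fr(-1, 4), Fr(1, 2): Fr(1, 2)}


def antiderivative(x):
    return x * x


def contains(interval, x):
    """interval = (a, b, left_closed, right_closed)."""
    a, b, left_closed, right_closed = interval
    above = x >= a if left_closed else x > a
    below = x <= b if right_closed else x < b
    return above and below


def measure(interval):
    """mu(I) = integral of the density over I + sum of the c(xi) with xi in I."""
    a, b, _, _ = interval
    continuous_part = antiderivative(b) - antiderivative(a)
    discrete_part = sum(c for xi, c in POINT_MASSES.items() if contains(interval, xi))
    return continuous_part + discrete_part


def integrate_step(pieces):
    """mu(phi) = sum of phi(xi_k) * mu(I_k) for pieces [(I_k, value on I_k), ...]."""
    return sum(value * measure(interval) for interval, value in pieces)


left = (Fr(0), Fr(1, 2), True, False)    # [0, 1/2[
right = (Fr(1, 2), Fr(1), True, True)    # [1/2, 1]
whole = (Fr(0), Fr(1), True, True)       # [0, 1]

print(measure(left), measure(right), measure(whole))
print(integrate_step([(left, 3), (right, -1)]))
```

The type of the interval matters only through the point masses: moving the endpoint $1/2$ from one piece to the other moves the mass $c(1/2)$ with it, while the contribution of the density is unchanged.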
In elementary practice one is above all interested in integrating continuous
functions; when, in the theorems on differentiation under the $\int$ sign for
example, we have introduced an arbitrary regulated function $\mu(x)$ in front of
the symbol $dx$ of Lebesgue measure


$f \mapsto m(f) = \int f(x)\,dx$


on an interval $X$, this was not for the pleasure of integrating discontinuous
functions; it was in order to obtain a theorem applicable to the measure


$f \mapsto \mu(f) = \int f(x)\mu(x)\,dx.$


There are other more technical reasons to think that in a “good” integration
theory the starting point is not the measure $\mu(I)$ of an arbitrary interval
$I \subset K$, but rather the integral $\mu(f)$ of an arbitrary continuous function
$f$ on $K$; anyway, and as we have just seen, the passage from measures of
intervals to the integration of continuous (or even regulated) functions is
quasi-instantaneous, once one has understood the construction of the classical
integral.


Fig. 13.


The real problem, solved by Lebesgue, is to integrate functions much more
general than the continuous or regulated functions (start with the semicontinuous
functions). With Lebesgue, a century ago, one started by extending
the concept of the measure of an interval to that of the measure of a much
more complicated set $E \subset K$, for example to the sets which are countable
unions of countable intersections of countable unions of countable intersections
of open intervals (it can get even worse, but this is not important).
Starting from this one integrates a function $f$, assumed bounded for simplicity,
as follows: for every integer $n \ge 1$, consider for any $p \in \mathbf{Z}$ the set
$E_{p,n} = \{p/n \le f(x) < p/n + 1/n\}$ on which $f$ is equal to $p/n$ to within $1/n$;
these sets form a finite partition of $K$; if they belong to the category of those
for which one can define $m(E)$ [or $\mu(E)$ in the case of an arbitrary measure],
one may consider the Lebesgue sum $\sum_p m(E_{p,n})\,p/n$ (as against the Riemann
sum); geometrically, one considers the set of the points $(x, y)$ in the plane lying
between $K$ and the graph of $f$, cuts it by the lines $y = p/n$ into horizontal
slices having the $E_{p,n}$ as bases, and approximates the required area $\int f(x)\,dx$
by the sum of the areas of these horizontal slices (fig. 13). Lebesgue’s genius
was not just to have replaced the decomposition into vertical slices by decomposition
into horizontal slices; it was to have understood that this innocent
modification of the traditional procedure provided a method formidably more
powerful than that of Riemann. It has been generalised ad libitum, but no
one has ever progressed in a way that would be useful beyond certain ad hoc
problems. The mode of exposition has only been modified from that chosen
by Lebesgue in an age when the concepts of vector space, of linear form and
of norm had not yet been isolated: analysis had been arithmetised and was
now, in Germany rather more than in France, as rigorous as number theory,
but it had not yet been algebraised; in fact, integration theory was probably
the impetus that forced the analysts to learn, even to invent, what a vector
space of infinite dimension ought to be, starting with Hilbert spaces.
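
The difference between horizontal and vertical slicing can be seen on a toy case. The sketch below assumes $f(x) = x^2$ on $K = [0, 1]$, an example not taken from the text and chosen because the sets $E_{p,n}$ are then intervals of known length; `lebesgue_sum` and `riemann_sum` are hypothetical names. For a genuinely irregular $f$ the whole difficulty, as the text says, lies in defining $m(E_{p,n})$, which this sketch sidesteps.

```python
import math


def lebesgue_sum(n):
    """Lebesgue sum of f(x) = x^2 on K = [0, 1] for the slicing step 1/n.

    E_{p,n} = {p/n <= f < (p+1)/n} is the interval [sqrt(p/n), sqrt((p+1)/n)[,
    so its length m(E_{p,n}) is known exactly here.
    """
    total = 0.0
    for p in range(n):  # f takes the value 1 only at x = 1, a set of measure 0
        length = math.sqrt((p + 1) / n) - math.sqrt(p / n)
        total += length * (p / n)
    return total


def riemann_sum(n):
    """Left-endpoint Riemann sum of the same function with n vertical slices."""
    return sum((k / n) ** 2 for k in range(n)) / n


for n in (10, 100, 1000):
    print(n, lebesgue_sum(n), riemann_sum(n))
```

Since $f$ differs from $p/n$ by less than $1/n$ on $E_{p,n}$, the Lebesgue sum lies below the integral $1/3$ by at most $m(K)/n = 1/n$, whatever the regularity of $f$.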


Since, for every compact set $K \subset \mathbf{C}$ (or in $\mathbf{R}^p$, or for every metric compact
space), one has available the space⁵⁶ $L(K) = C^0(K)$ of the scalar continuous
functions on $K$ and the norm


$\|f\|_K = \sup_{x \in K} |f(x)|$


of uniform convergence on $K$, one is led to define a Radon measure on $K$ in
the following way: it is a map


$\mu : L(K) \to \mathbf{C}$


which is linear in the general sense of algebra and continuous in the general
sense of the theory of normed vector spaces: there exists a constant $M(\mu)$
such that


$|\mu(f)| \le M(\mu)\|f\|_K$


for every $f \in L(K)$. Such a measure is said to be positive if

$f \ge 0 \Longrightarrow \mu(f) \ge 0.$


One shows without much difficulty (Chap. XI, n° 17, Theorem 29) that every
measure $\mu$ can be put in the form $\mu(f) = \mu_1(f) - \mu_2(f) + i\mu_3(f) - i\mu_4(f)$ with
positive measures $\mu_k$. It is generally worth restricting oneself, or reducing,
to the case of positive measures, by far the most important, and the only
ones for which a grand theory has been constructed.


56 The notation $L(X)$ was introduced in André Weil, L’intégration dans les groupes
topologiques et ses applications (Paris, Hermann, 1940), a book from which many
of my generation learned integration and generalised Fourier analysis. I assume
that Weil chose it not only in homage to Lebesgue, but also because he composed
directly onto the typewriter ...

Leibniz’ notation having amply proved its usefulness, one imitates it by
writing


$\mu(f) = \int f(x)\,d\mu(x),$


a notation which will justify itself better below, but which one may use
without understanding its origin. So, by definition, we have the relation


(30.7)    $\int (3\sin x - 5\log x)\,d\mu(x) = 3\int \sin x\,d\mu(x) - 5\int \log x\,d\mu(x)$


for any constants 3 and −5 and continuous functions sin and log on $K$, also
that


(30.8)    $\left|\int f(x)\,d\mu(x)\right| \le M(\mu)\|f\|_K$

for every continuous function $f$ on $K$, with a constant $M(\mu)$ to be chosen
as small as possible; this is the norm of the measure $\mu$, notation $\|\mu\|$; for the
usual measure $m$ on an interval of $\mathbf{R}$, we have $\|m\| = m(K)$. The object of this
condition is to guarantee that Theorem 4 concerning uniform limits applies
again here: if a sequence of continuous functions $f_n$ converges uniformly on
$K$ to a necessarily continuous limit $f$ one has


(30.9)    $|\mu(f) - \mu(f_n)| = |\mu(f - f_n)| \le M(\mu)\|f - f_n\|_K,$


whence $\mu(f) = \lim \mu(f_n)$. If, likewise, a series $s(x) = \sum u_n(x)$ of continuous
functions on $K$ converges normally, one may integrate term-by-term:


(30.10)    $\mu\left(\sum u_n\right) = \sum \mu(u_n).$


In short, we have transformed Theorems 1 and 4 of § 1 into definitions.


Example 1. Choose a function $\mu(x)$, integrable (in the usual sense) on an
interval $K \subset \mathbf{R}$, and put


(30.11)    $\mu(f) = \int f(x)\mu(x)\,dx$


for every $f \in L(K)$. Linearity is obvious and continuity follows from the
inequalities


$|\mu(f)| \le \int |f(x)| \cdot |\mu(x)|\,dx \le \|f\|_K \cdot \int |\mu(x)|\,dx.$


Here $\|\mu\| \le \int |\mu(x)|\,dx$ (and, in fact, we have equality).
The simplest case is obtained by choosing $\mu(x) = 1$; this is the measure
$m$ which appears in all of classical analysis and has been the subject of
this chapter. It is called the Lebesgue measure on $K$, not, of course, because
poor Henri Lebesgue had discovered that the length of an interval $(a, b)$ is
equal to $b - a$, but because it is for this particularly important measure
that he invented the grand integration theory that was later extended to all
measures defined on all reasonable topological spaces. In the general case (11)
one speaks of the measure of density $\mu(x)$ with respect to Lebesgue measure:
obvious physical interpretation.


Example 2. Choose a countable set $D$ of points of $K$ and, for every $\xi \in D$, a
number $c(\xi) \in \mathbf{C}$; assuming $\sum |c(\xi)| < +\infty$ one may define


(30.12)    $\mu(f) = \sum c(\xi) f(\xi)$


for every continuous function f on K, the series being taken over D. No 
hypothesis on the compact set K is necessary here. 


Example 3. Take $K = A \times B$ where $A$ and $B$ are compact intervals in $\mathbf{R}$ and
put


$m(f) = \iint_{A \times B} f(x, y)\,dx\,dy$


for every continuous function f on K (n° 9, Theorem 10). 


Example 4. Choose for $K$ the set $\mathbf{T} : |z| = 1$ of complex numbers of modulus 1
(unit circle); the “parametric representation” $z = \exp(2\pi i t) = \mathbf{e}(t)$ of the
points of $\mathbf{T}$ transforms any function $f \in L(\mathbf{T})$ into a continuous function
$f[\mathbf{e}(t)]$ of period 1 on $\mathbf{R}$. One thus obtains a very privileged measure on $\mathbf{T}$ by
putting


(30.13)    $m(f) = \int_0^1 f[\mathbf{e}(t)]\,dt = \frac{1}{2\pi} \int_0^{2\pi} f(e^{it})\,dt$


for every $f \in L(\mathbf{T})$; this is what dominates the theory of Fourier series. One
could clearly replace $\mathbf{T}$ by any other parametrised curve, but it is better to
defer this type of example to when we shall need it (mainly line integrals of
holomorphic functions).
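
As a small illustration of why the measure (30.13) governs Fourier series, the sketch below approximates $m(f)$ by an equispaced sum and evaluates it on powers of $z$. The function names `e` and `m` mirror the text, but the discretisation (1024 sample points) is an assumption of this sketch; the standard orthogonality relation $m(z^n) = 0$ for $n \ne 0$ is what the printed values illustrate.

```python
import cmath


def e(t):
    """The parametrisation e(t) = exp(2*pi*i*t) of the unit circle T."""
    return cmath.exp(2j * cmath.pi * t)


def m(f, samples=1024):
    """Approximate m(f) = integral over [0, 1] of f(e(t)) dt by an equispaced sum."""
    return sum(f(e(k / samples)) for k in range(samples)) / samples


print(abs(m(lambda z: 1)))                             # total mass of T
print(abs(m(lambda z: z ** 3)))                        # integral of z^3
print(abs(m(lambda z: z ** 3 * z.conjugate() ** 3)))   # integral of |z^3|^2
```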


We shall see later how, in the case of an interval, one may construct all 
the measures on K by a procedure analogous to that of n° 1 and 2 concerning 
Lebesgue measure. 

We shall now show quickly how some of the theorems on Lebesgue mea- 
sure extend to Radon measures. 


Differentiation under the $\int$ sign. Theorem 9 of n° 9 extends trivially (i.e.
with the same proof) to the functions

(30.14)    $\varphi(y) = \int f(x, y)\,d\mu(x),$


where we have not indicated that the integral is extended over $K$; $f$ is a
continuous function on $K \times J$ and $J$ an interval of $\mathbf{R}$. The result is a continuous
function and it is of class $C^1$ if $D_2 f$ exists and is continuous, with, in this
case,


(30.15)    $\varphi'(y) = \int D_2 f(x, y)\,d\mu(x).$


To reproduce the proofs verbatim here would be to waste our time and that
of the reader. The simplest case is that of an interval $K \subset \mathbf{R}$, but in fact the
argument and the result apply to every compact set $K$ if one knows that,
on every compact set (for example, here, $K \times H$ where $H \subset J$ is a compact
interval), a continuous function is uniformly continuous.

Exercise. The function $f \star \mu(x) = \int f(x - y)\,d\mu(y)$ is $C^\infty$ if $f$ is $C^\infty$ and
of compact support on $\mathbf{R}$.


Double integrals. A slightly less easy exercise consists of generalising Theorem 10
to measures:


Theorem 30. Let $K$ and $H \subset \mathbf{R}$ be two compact sets, $\mu$ and $\nu$ measures on
$K$ and $H$, and $f$ a continuous function on $K \times H$. Then


(30.16)    $\int d\mu(x) \int f(x, y)\,d\nu(y) = \int d\nu(y) \int f(x, y)\,d\mu(x).$


Inspired by the proof of Theorem 10 one is led, for a given $r > 0$, to take
partitions $(K_p)$ of $K$ and $(H_q)$ of $H$, and to compare each member of (16) to
the sum of the analogous expressions obtained on replacing $K$ and $H$ by $K_p$
and $H_q$, i.e. by multiplying $f(x, y)$ by $\chi_p(x)\theta_q(y)$ where $\chi_p$ and $\theta_q$ are the
characteristic functions of $K_p$ and $H_q$. Since


(30.17)    $\sum \chi_p(x) = 1$ on $K$,  $\sum \theta_q(y) = 1$ on $H$,


we have $f(x, y) = \sum f(x, y)\chi_p(x)\theta_q(y)$ on $K \times H$, which explains why the
sum of the integrals over the products $K_p \times H_q$ is the integral of $f$ over $K \times H$.

This method is not directly applicable here: for an arbitrary measure we
don’t yet know how to integrate discontinuous functions. The solution is to
replace the discontinuous functions which frustrate us by continuous positive
functions $k_p$ and $h_q$ on $K$ and $H$ still satisfying (17) and zero outside sets
$A_p$ or $B_q$ which are small enough that the function $f$ is constant to within $r$
on each product $A_p \times B_q$. By (17) and the linearity of $\mu$ and $\nu$, we will then
have


(30.18)    $\int d\mu(x) \int f(x, y)\,d\nu(y) = \sum_{p,q} \int d\mu(x) \int f(x, y)\,k_p(x)h_q(y)\,d\nu(y);$

this is all meaningful, and we can imitate the proof of Theorem 10.
Let us be more precise. Choose an $r > 0$ and an $r' > 0$ such that (uniform
continuity: n° 8)


(30.19)    $(|x' - x''| < r')\ \&\ (|y' - y''| < r') \Longrightarrow |f(x', y') - f(x'', y'')| < r.$


Cover $K$ by a finite number of closed sets $F_p$ of diameters⁵⁷ $< r'/2$ (we
are not dealing with a partition) and, for each $p$, let us choose an open set
$U_p \supset F_p$ of diameter $< r'$; we may, but there is no point, assume that these
sets are intervals. For every $p$ there exists a continuous positive function
$\varphi_p$ on $K$ which is strictly positive on $F_p$ and zero outside $U_p$, for example
$\varphi_p(x) = d(x, K - U_p \cap K)$, the distance from the point $x$ to the complement
of $U_p \cap K$ in $K$, closed and disjoint from $F_p$. The continuous function $k(x) =
\sum \varphi_p(x)$ being $> 0$ at all $x \in K$, since the $F_p$ cover $K$, we need only put
$k_p(x) = \varphi_p(x)/k(x)$ to obtain the functions we seek.
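
The recipe $k_p = \varphi_p / \sum \varphi_q$ is easy to try out. The sketch below is an illustration under assumptions not made in the text: $K = [0, 1]$, the closed sets $F_p$ are $N = 4$ consecutive intervals, and each $U_p$ is $F_p$ enlarged by a margin `DELTA`; the helper names `phi` and `k` follow the text's $\varphi_p$ and $k_p$.

```python
# Hypothetical setting: K = [0, 1], covered by N closed intervals
# F_p = [p/N, (p+1)/N], each enlarged to an open interval U_p = ]a_p, b_p[.
N, DELTA = 4, 0.05
OPEN_SETS = [(p / N - DELTA, (p + 1) / N + DELTA) for p in range(N)]


def phi(p, x):
    """phi_p(x) = distance from x to the complement of U_p in K = [0, 1]."""
    a, b = OPEN_SETS[p]
    if not (a < x < b):
        return 0.0
    gaps = []
    if a >= 0.0:          # the complement contains [0, a]
        gaps.append(x - a)
    if b <= 1.0:          # the complement contains [b, 1]
        gaps.append(b - x)
    return min(gaps)      # never empty here: no U_p contains all of K


def k(p, x):
    """k_p = phi_p / (phi_0 + ... + phi_{N-1}), the normalised functions."""
    return phi(p, x) / sum(phi(q, x) for q in range(N))


for x in (0.0, 0.26, 0.5, 1.0):
    values = [k(p, x) for p in range(N)]
    print(x, [round(v, 3) for v in values], sum(values))
```

The denominator never vanishes on $K$ because every point of $K$ lies in some $F_p$, where $\varphi_p$ is at least `DELTA`; this is exactly the role of the covering by the $F_p$ in the argument above.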


Fig. 14.


Similarly we construct continuous positive functions $h_q$ on $H$, with sum 1
and zero outside open sets $V_q$ of diameters $< r'$. Finally, we choose points
$\xi_p \in A_p = U_p \cap K$ and $\eta_q \in B_q = V_q \cap H$.

In the general term of the sum (18) the integrand cannot be $\ne 0$ at a point
$(x, y)$ unless $k_p(x)$ and $h_q(y)$ are so, i.e. if $(x, y) \in A_p \times B_q$. Then $|x - \xi_p| < r'$
and $|y - \eta_q| < r'$ and thus, by (19),


(30.20)    $|f(x, y)k_p(x)h_q(y) - f(\xi_p, \eta_q)k_p(x)h_q(y)| \le r\,k_p(x)h_q(y).$


In fact, this inequality is valid for any $(x, y) \in K \times H$ since outside $A_p \times
B_q$ either $k_p(x) = 0$ or $h_q(y) = 0$. On adding the inequalities (20) and
remembering that $\sum k_p(x)h_q(y) = 1$ one finds


$\left|f(x, y) - \sum f(\xi_p, \eta_q)k_p(x)h_q(y)\right| \le r$


for any $x$ and $y$. Put $g(x, y) = \sum f(\xi_p, \eta_q)k_p(x)h_q(y)$; then


(30.21)    $\|f - g\|_{K \times H} \le r.$


57 The diameter of a set $X \subset \mathbf{C}$ is the number $\sup d(x, y)$ where $x, y$ vary in $X$.
Denote the two sides of (16) by $\lambda'(f)$ and $\lambda''(f)$. These are measures
on $K \times H$: linearity is obvious, and continuity follows from $|\lambda'(f)| \le
M(\mu)M(\nu)\|f\|_{K \times H}$, with the same inequality for $\lambda''$. By (21), we then have
$\lambda'(f) = \lambda'(g)$ and $\lambda''(f) = \lambda''(g)$ to within $M(\mu)M(\nu)r$. Since $r > 0$ is
arbitrary it is then enough to show that $\lambda'(g) = \lambda''(g)$ to establish (16).

But this is obvious: for any $k \in L(K)$ and $h \in L(H)$ we have


$\int d\mu(x) \int k(x)h(y)\,d\nu(y) = \int d\mu(x)\,k(x) \int h(y)\,d\nu(y) =$
$= \int d\mu(x)\,k(x)\nu(h) = \nu(h) \int k(x)\,d\mu(x) = \mu(k)\nu(h)$


and the same calculation, with the same result, on interchanging the order
of the integrations; since $g$ is a linear combination of functions of the form
$k(x)h(y)$ we have $\lambda'(g) = \lambda''(g)$, qed.

This proof generalises fully: in every metric space one may find systems 
of continuous positive functions satisfying (17) and zero outside arbitrarily 
small given open sets; such systems of functions are called partitions of unity. 
The method applies to triple, quadruple, integrals etc. 

The two essential points in the proof are that (i) the identity (16) is
obvious if $f(x, y)$ is a finite sum of functions of the form $g(x)h(y)$, (ii) every
$f \in L(K \times H)$ is, by (21), the limit of such functions, uniformly on $K \times H$. The
general Stone-Weierstrass theorem provides (ii) without the least calculation:
it is enough to check that the set $A$ of functions of the form $\sum g_p(x)h_q(y)$,
with $g_p \in L(K)$ and $h_q \in L(H)$ complex-valued, satisfies the conditions (a),
(b), (c), (d) of n° 28; a very easy exercise.

Finally one has to observe that the map


$f \mapsto \iint f(x, y)\,d\mu(x)\,d\nu(y)$


(it is now unnecessary to specify the order of the integrations) is a measure
on the compact set $K \times H$, the product measure or Cartesian product
of $\mu$ and $\nu$. On choosing $d\mu(x) = dx$ and $d\nu(y) = dy$ one recovers Lebesgue
measure in the plane.


All the theory expounded in n°s 10 and 11 extends to general positive
measures, with the very same proofs: we have established what we need and
only have to replace $m(f)$ everywhere by $\mu(f)$ and $m(K)$ by $M(\mu)$ or $\|\mu\|$.


31 — Measures on a locally compact set 


Since Lebesgue measure allows one to integrate over intervals which are neither
compact nor even bounded we should be able to extend the definition of
Radon measures to this case. First we examine the classical situation more
closely, it being a little less simple than the compact case.

If $X = (a, b) \subset \mathbf{R}$ is an arbitrary interval one cannot define the integral
$\int f(x)\,dx$ over $X$ for just any continuous function $f$ on $X$: there must be
convergence conditions at the endpoints of $X$ if these do not belong to $X$ or
are infinite. A radical method to eliminate them is to assume the function
$f(x)$ to be zero when $x$ is close enough to $a$ or $b$, i.e. that there exists a
compact⁵⁸ interval $K = [u, v] \subset X$ such that $f(x) = 0$ for $x \notin K$. Then


(31.1)    $\int_{K'} f(x)\,dx = \int_K f(x)\,dx$ for $K \subset K' \subset X$,


so that the integral taken over $X$ converges absolutely for a trivial reason
and reduces to the integral over any compact set outside which $f$ is zero.
The functions of this kind are called of compact support on $X$ (n° 27); the
set of these is a vector space which we again denote by $L(X)$ and which many
other authors denote $C^0_c(X)$, with a suffix $c$ whose significance is obvious. For
every compact $K \subset X$ the set of $f \in L(X)$ which vanish outside $K$ is a vector
subspace $L(X, K)$ of $L(X)$.

In this way Lebesgue measure on $X$ gives us a (positive) linear form
$f \mapsto m(f)$ on $L(X)$. We have a norm on $L(X)$


(31.2)    $\|f\|_X = \sup_{x \in X} |f(x)|,$


but if $X$ is not bounded the linear form $m$ is not continuous relative to this
norm, in other words, there is no finite constant $M$ such that


(31.3)    $|m(f)| = \left|\int_u^v f(x)\,dx\right| \le M \cdot \sup_{x \in X} |f(x)| = M\|f\|_X$


for every continuous function $f$ on $X$ which is zero outside some compact
interval $K = [u, v] \subset X$ and otherwise arbitrary. For example take $X =
]0, +\infty[$, with $0 < u < v < +\infty$; you can clearly find a function $f$ with
values everywhere between 0 and 1, equal to 1 on $K$ and zero outside a
compact $K' \subset X$ a little larger than $K$ (make the characteristic function of
$K$ continuous by replacing its discontinuities at $u$ and $v$ by line segments); for
such a function the left hand side of the preceding inequality is $\ge m(K) =
v - u$ and the right hand side is equal to $M$; impossible if $u$ and $v$ can take
arbitrary values between 0 and $+\infty$. For $X = [0, +\infty[$, one may take $u = 0$,
but the difficulty remains.

Failing (3), one may all the same observe that


58 If $a$ is finite one must have $a \le u$ if $a \in X$, $a < u$ if not; if $b$ is finite similarly
one must have $b \ge v$ if $b \in X$, $b > v$ if not. In the case where $a$ or $b$ is infinite
both $u$ and $v$ must be finite. If for example $X = [0, +\infty[$, a continuous function
on $X$ is so in particular at 0, so that the only difficulty in integrating it over $X$
comes from the other endpoint of $X$.




(31.4)    $|m(f)| \le m(K)\|f\|_X$ for every $f \in L(X, K)$


since in fact the integral over $X$ involves only the compact $K \subset X$ off which
$f$ is zero. This is the result we can generalise.

We shall call a Radon measure on $X$ any linear form $\mu$ on the vector space
$L(X)$ having the following property: for every compact $K \subset X$ there exists
a constant $M_K(\mu)$ such that


(31.5)    $|\mu(f)| \le M_K(\mu)\|f\|_X$


for every $f \in L(X, K)$. In other words, we assume that the linear form $\mu$
is continuous on each subspace $L(X, K)$, though not necessarily on all of
$L(X) = \bigcup L(X, K)$.


Example 1. Take for $X$ the open interval $]0, +\infty[$ and


(31.6)    $\mu(f) = \int_0^{+\infty} f(x)\,dx/x.$


There is no problem with convergence for $f \in L(X)$ since $f(x)$ is zero on a
neighbourhood of 0 and for $x$ large. If $f$ is zero outside $K = [u, v]$ then


$|\mu(f)| \le (\log v - \log u)\|f\|_X,$


whence continuity. 
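
The dependence of the constant on $K$ can be observed numerically. The sketch below is an illustration only: the tent-shaped test functions and the midpoint rule in the variable $t = \log x$ are assumptions of this sketch, and `mu` and `tent` are hypothetical helper names.

```python
import math


def mu(f, u, v, steps=20000):
    """Approximate mu(f) = integral of f(x) dx/x over [u, v] (midpoint rule in t = log x)."""
    lo, hi = math.log(u), math.log(v)
    width = (hi - lo) / steps
    return width * sum(f(math.exp(lo + (i + 0.5) * width)) for i in range(steps))


def tent(u, v):
    """A continuous function with sup norm 1, zero outside K = [u, v]."""
    mid = 0.5 * (u + v)

    def f(x):
        if x <= u or x >= v:
            return 0.0
        return (x - u) / (mid - u) if x <= mid else (v - x) / (v - mid)

    return f


for u, v in ((1.0, 2.0), (0.01, 100.0), (1e-6, 1e6)):
    print(u, v, mu(tent(u, v), u, v), math.log(v) - math.log(u))
```

Each value stays below $\log v - \log u$, but that bound itself grows without limit as $K$ does, so no single constant can serve for all of $L(X)$.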
One could clearly replace the function $1/x$ by any regulated function $\mu$
on $X$; if the function $\mu$ is absolutely integrable in $X$ then


$|\mu(f)| \le M(\mu)\|f\|_X$


where $M(\mu) = \int |\mu(x)|\,dx$ no longer depends on the support $K$ of $f$. In this
particular case the linear form $\mu$ is continuous on $L(X)$ and not just on the
subspaces $L(X, K)$. A measure possessing this property is said to be bounded
or of finite total mass on $X$.


Example 2. If one replaces the open interval $]0, +\infty[$ by the closed interval
$[0, +\infty[$ the formula (6) is no longer meaningful, since, in this case, a function
$f \in L(X)$ is required to be zero for $x$ large but not on a neighbourhood of 0,
so allowing every chance of making the integral (6) divergent.

But one can replace $1/x$ by a function that poses no problem at the origin,
and, for example, put


(31.7)    $\mu(f) = \int_0^{+\infty} f(x)x^s\,dx$ with $\mathrm{Re}(s) > 0$.

If $f$ is zero outside $K = [0, v]$ then


$|\mu(f)| \le M_K(\mu)\|f\|_X$

with

$M_K(\mu) = \int_0^v |x^s|\,dx = \frac{v^{\mathrm{Re}(s)+1}}{\mathrm{Re}(s)+1}.$


One obtains a bounded measure on replacing $x^s$ by an absolutely integrable
function on $X$, for example $x^s e^{-x}$ with $\mathrm{Re}(s) > -1$.


Example 3. Choose a compact interval $K \subset X$, a measure $\mu$ on $K$ and consider
the linear form $f \mapsto \int f(x)\,d\mu(x)$, where one integrates over $K$, so
involving only the values of $f$ on this fixed compact set. This example shows
that a measure on $K$ may also be considered as a measure on $X$: all the mass
is supported by $K$.


Example 4. For $X = \mathbf{R}$ put


(31.8)    $\mu(f) = \sum f(n)$


summing over $\mathbf{Z}$. If $f$ is zero outside a compact $K$, only the $n \in K$ count,
whence $|\mu(f)| \le M_K(\mu)\|f\|_X$ where $M_K(\mu)$ is the number of integers in $K$.
More generally one may put


(31.9)    $\mu(f) = \sum c(\xi)f(\xi)$


assuming only that for every compact $K \subset X$ the series taken over the $\xi \in K$
converges absolutely. Then, for $f \in L(X, K)$,


$|\mu(f)| \le M_K(\mu)\|f\|_X$  where  $M_K(\mu) = \sum_{\xi \in K} |c(\xi)|$


since the $\xi \notin K$ do not appear in (9). If the total series $\sum |c(\xi)|$ converges
one obtains a bounded measure as in Example 2, for example


$\mu(f) = \sum n^{?}\sin(1/n)f(1/n)$ for $X = [0, +\infty[$, where one sums over the
integers $> 0$.
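
For the measure (31.8) the local constant $M_K(\mu)$ is simply a count, as the following minimal sketch shows. The test function `bump` and the compact set $K = [-2.5, 3.5]$ are assumptions chosen for illustration, and `mu` and `integers_in` are hypothetical helper names.

```python
import math


def mu(f, u, v):
    """mu(f) = sum of f(n) over the integers n, for f zero outside K = [u, v]."""
    return sum(f(n) for n in range(math.ceil(u), math.floor(v) + 1))


def integers_in(u, v):
    """M_K(mu): the number of integers in K = [u, v]."""
    return max(0, math.floor(v) - math.ceil(u) + 1)


def bump(x):
    """A continuous function with sup norm 1, zero outside [-2.5, 3.5]."""
    return max(0.0, 1.0 - abs(x - 0.5) / 3.0)


print(mu(bump, -2.5, 3.5), integers_in(-2.5, 3.5))
```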


In all this, the fact that $X$ is an interval of $\mathbf{R}$ hardly plays a rôle. One
could assume that $X$ is a locally compact subset of $\mathbf{C}$ (end of n° 23), that is, an open
set, a closed set, or, the general case, the intersection of an open and of a
closed⁵⁹ set, and consider the vector space $L(X)$ of continuous functions on
$X$ which are zero outside some compact subset of $X$, with its obvious vector
subspaces $L(X, K)$. A Radon measure on $X$ is then again a linear form on
$L(X)$ whose restriction to each $L(X, K)$ satisfies a relation (5).

The assumption that X is locally compact is needed to assure the exis- 
tence of “many” functions f € L(X). The proof rests on the following lemma: 


59 Exercise. Show from the definition that the intersection of two locally compact
subsets of $\mathbf{C}$ is locally compact.
Lemma. Let $X$ be a subset of $\mathbf{C}$, $K$ a compact subset of $X$ and $F$ a subset
of $X$ disjoint from $K$ and closed in⁶⁰ $X$. There exists a continuous function
$f$ on $X$ such that


(31.10)    $f = 1$ on $K$,  $0 \le f \le 1$ on $X - K$,  $f = 0$ on $F$.

If $X$ is locally compact one may assume $f \in L(X)$.


Consider the function $d(x, K)$; it is continuous, zero on $K$ and $> 0$ outside
$K$. Let us show that there exists an $r > 0$ such that $d(x, K) \ge r$ for every
$x \in F$. If not, there exist points $x_n \in F$ and $y_n \in K$ such that $d(x_n, y_n)$
tends to 0. Since $K$ is compact, one can (Bolzano-Weierstrass) assume that
$y_n$ tends to a limit $b \in K \subset X$; it is clear that then the $x_n \in F$ tend to $b$.
But since $F$ is closed in $X$ this implies $b \in F \cap K = \emptyset$, absurd.

This done, we put $f(x) = \varphi[d(x, K)]$ with a function $\varphi(t)$ defined and
continuous for $t \ge 0$. For $f$ to satisfy the conditions (10) it is enough that $\varphi$
should satisfy the following conditions: (i) $\varphi(0) = 1$; (ii) $0 \le \varphi(t) \le 1$ for any
$t \ge 0$; (iii) $\varphi(t) = 0$ for $t \ge r$. The existence of such functions is clear.
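
One explicit choice of $\varphi$ is the piecewise linear function $\max(0, 1 - t/r)$. The sketch below applies it in a hypothetical situation not taken from the text: $X = \mathbf{R}$, $K = [0, 1]$ and $F = ]-\infty, -1] \cup [2, +\infty[$, so that $r = 1$ works; `dist_to_K`, `phi` and `f` are illustrative names.

```python
def dist_to_K(x):
    """d(x, K) for the compact set K = [0, 1] in X = R (a hypothetical example)."""
    return max(0.0, -x, x - 1.0)


R_GAP = 1.0  # here F = ]-inf, -1] U [2, +inf[, so d(x, K) >= 1 for every x in F


def phi(t):
    """Continuous on t >= 0, phi(0) = 1, values in [0, 1], zero for t >= R_GAP."""
    return max(0.0, 1.0 - t / R_GAP)


def f(x):
    """f = phi(d(., K)): equal to 1 on K, to 0 on F, between 0 and 1 elsewhere."""
    return phi(dist_to_K(x))


for x in (-2.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0):
    print(x, f(x))
```

In this example $f$ vanishes outside the compact interval $[-1, 2]$, so it also belongs to $L(X)$, as the second assertion of the lemma requires when $X$ is locally compact.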

Now assume X locally compact. Since F is closed in X, for every a ∈ X − F
there exists an open ball B_X(a) of X [the set of x ∈ X such that d(a,x) < r]
such that the corresponding closed ball B̄_X(a) [the set of x ∈ X such that
d(a,x) ≤ r] does not meet F. Since X is locally compact one may assume
B̄_X(a) compact by choosing r small enough. Since K is also compact one
may (Borel-Lebesgue) cover it with a finite number of balls B_X(a_p) [these
are the intersections of X with open balls of C]. The union U of these balls is
open in X and contained in the compact union H of the B̄_X(a_p), which does
not meet F. If one applies the lemma to K and X − U ⊃ X − H one obtains
a function f which satisfies (10) and is zero outside the compact subset H of
X, qed.

Exercise. Let X be the (not locally compact) union of the open disc
|z| < 1 and of the interval [1,2] of R. Show that f(1) = 0 for every function
f continuous in X and zero outside a compact K ⊂ X.

Examples of measures on a locally compact subset of C are not always as
obvious as in R. Certainly there are discrete measures like those of Example 4
of n° 31. It is easy to obtain measures analogous to those of Examples 1 and
2 if X is open: choose an arbitrary continuous function p on X (for we do
not yet know how to integrate anything else) and put

$$\mu(f) = \iint f(x,y)\,p(x,y)\,dx\,dy$$

for every f ∈ L(X); for every compact K ⊂ X the set X − K is open in C
so that on agreeing to attribute to f(x, y)p(x, y) the value 0 outside X one


60 This means that every limit in X of points of F is in F, or again, that F is the
intersection of X and of a closed set in C. This is the general concept of closed 
set in the metric space obtained by endowing the set X with its usual distance. 
defines a continuous function of compact support on C; the integral μ(f) is
then obtained by integrating over any compact rectangle A x B containing 
the support of f. 

If X = I × J, where I and J are arbitrary intervals of R, and if one has
measures μ and ν on I and J, one can define a product measure

$$f \longmapsto \int d\mu(x) \int f(x,y)\,d\nu(y) = \int d\nu(y) \int f(x,y)\,d\mu(x)$$


as in the case where I and J are compact; to legitimise this construction one
has to prove an analogue of Theorem 30, which is hardly necessary since f,
being zero outside a compact subset of I × J, is zero outside a rectangle K × H
where K ⊂ I and H ⊂ J are compact⁶¹; the arguments of Theorem 30 are
thus directly applicable here. Starting from these products of measures, you
can choose a continuous function p on X = I × J and consider the linear
form

$$f \longmapsto \iint f(x,y)\,p(x,y)\,d\mu(x)\,d\nu(y)$$


as we did for Lebesgue measure. For example, take I = ]0,+∞[ open,
J = [0,+∞[ closed and

$$\mu(f) = \int f(x)\,\frac{dx}{x}, \qquad \nu(f) = \int f(y)\,y^{-1/2}\,dy;$$

on the square I × J ⊂ C, neither open nor closed in C but nevertheless locally
compact, one obtains the measure

$$f \longmapsto \iint f(x,y)\,\frac{dx\,dy}{x\,y^{1/2}}.$$


The integral is defined since the points of a compact subset of I × J cannot
approach indefinitely close to the subset x = 0 of the frontier of I × J where


61 Proof: the projections (x, y) ↦ x and (x, y) ↦ y are continuous maps of C into
R and, in particular, of I × J into I and J respectively; they therefore transform
every compact subset of I × J into compacta K ⊂ I and H ⊂ J (Chap. III, n° 9,
Theorem 11), so that the given compact set is contained in K × H.
the integration in x would diverge. In particular, a compact subset remains
separated from the point (0,0), which does not belong to I × J.

Exercise (less easy). Choose X = C and try to define a measure on X by 
the formula 


$$\mu(f) = \iint f(x,y)\,(x^2 + y^2)^{s}\,dx\,dy;$$


find the real values of s for which this is meaningful. 


The integration of semicontinuous functions in the case of a noncompact
interval X, more generally of a locally compact subset of C, proceeds more
or less as in the compact case. Dini’s Theorem shows that if an increasing
sequence of continuous functions f_n has a continuous function f for its upper
envelope, then convergence is uniform on every compact K ⊂ X. As in the
case where X is compact, one deduces that if f and the f_n are in L(X) then
μ(f) = lim μ(f_n) = sup μ(f_n) for every positive measure μ on X.

On replacing the f_n by f_n − f_1, so replacing f by f − f_1 and μ(f_n) by
μ(f_n) − μ(f_1), we may assume the f_n positive. Since they are majorised by
f they are all zero outside a compact K of X independent of n. Thus

$$|\mu(f) - \mu(f_n)| \le M_K(\mu)\,\|f - f_n\|_K,$$

which yields the result, clearly applicable to increasing philtres as in n° 10.

To pass from this to the lsc functions we shall restrict ourselves to
functions φ which are positive outside a compact K ⊂ X, for otherwise one cannot
hope that L(X) contains an f ≤ φ. Being lsc, such a function is bounded
below on the compact set K ⊃ {φ < 0} [n° 10, (vi)], whence the existence of
an f_0 ∈ L(X) majorised by⁶² φ. If one writes L_inf(φ) for the set of f ∈ L(X)
satisfying f ≤ φ one sees, as in n° 10, on considering φ − f_0 which is lsc and
positive, that φ is the upper envelope of the f ∈ L_inf(φ): in the case of an
arbitrary locally compact X ⊂ C, the lemma above, applied taking for K the
set {a} and for F the set of x ∈ X such that d(a,x) ≥ r, replaces the figure
of n° 10. Then one puts

$$\mu^*(\varphi) = \sup_{f \in L_{\inf}(\varphi)} \mu(f). \tag{31.11}$$


The crucial point, as in n° 10, is that one may calculate μ*(φ) using any
increasing philtre Φ ⊂ L_inf(φ) having φ for upper envelope. First, the sup of
the μ(f) for f ∈ Φ is clearly ≤ μ*(φ). Conversely, for every h ∈ L_inf(φ) the set
of functions inf(f,h), where f ∈ Φ, is an increasing philtre (obvious) whose
upper envelope is h (obvious); the version of Dini’s Theorem just obtained
then shows that μ(h) is the upper bound of the integrals of the inf(f,h);


62 Let −m, m > 0, be the minimum of φ on K, i.e. on X. By the lemma above
there exists an f ≥ 0 in L(X) which is equal to m on K. The function −f is the
one we need.
since inf(f,h) ≤ f we conclude that the upper bound of the μ(f), f ∈ Φ,
majorises μ(h) for every h ∈ L_inf(φ), so majorises μ*(φ), whence the result.

From here, the machine turns on its own and yields the three properties
(i), (ii) and (iii) of n° 11. No need to reproduce the proofs: you replace m by
μ and C_inf by L_inf.

One may also consider on X the usc functions ψ which are lower envelopes
of continuous functions of compact support; this assumes these ψ negative
outside a compact (so zero outside a compact if they are positive everywhere),
and since they are bounded above on it, the existence in L(X) of functions
which majorise them is obvious as in the other case. Writing L_sup(ψ) for the set
of f ∈ L(X) which majorise ψ, one denotes by μ*(ψ) the lower bound of the
μ(f) when f runs through L_sup(ψ). Replacing ψ by −ψ would bring us back
to the previous case.

These constructions apply in particular to the everywhere continuous
functions on X. Such a function φ is, in many ways, the difference of two
positive continuous functions φ′ and φ″, for example φ⁺ and φ⁻, to which
the definition of the integral of an lsc function applies. It is therefore
natural to define μ(φ) = μ(φ′) − μ(φ″), but the definition has no meaning if
μ(φ′) = μ(φ″) = +∞, and depends a priori on the choice of φ′ and φ″. To
eliminate the first objection one limits oneself to the continuous functions
which are integrable for μ (understood: absolutely), for which one may choose
φ′ and φ″ to have finite integrals; since 0 ≤ φ⁺ ≤ φ′ we have the same for φ⁺
and φ⁻, so that |φ| = φ⁺ + φ⁻ is also integrable. The second objection is not one: for

$$\varphi = \varphi' - \varphi'' = \varphi^+ - \varphi^- \Longrightarrow \varphi' + \varphi^- = \varphi'' + \varphi^+,$$

whence the result thanks to the additivity of the integral.

In the case of Lebesgue measure on an interval X we recover the
definitions of n° 22. We may restrict ourselves to the case of a positive continuous
function φ. N° 22 defines convergence of the integral ∫ φ(x)dx by insisting
that the integrals on the compact sets K ⊂ X be bounded above; here we
assume that the integrals of the functions f ∈ L(X) majorised by φ are
bounded above. Since each of these f is zero outside a compact of X it is
clear that convergence in the sense of n° 22 implies convergence in the present
sense. Moreover, it is clear, since φ is continuous, that, for every compact
interval K ⊂ X, there exists an f ∈ L(X) equal to φ on K and ≤ φ everywhere
else: multiply φ by a continuous function of compact support with
values in [0, 1] and equal to 1 on K. Whence the implication in the reverse sense,
and the equality of the integrals of φ on X defined by the two methods.


32 — The Stieltjes construction 


Returning to the case of R, let us take an increasing function μ(x) (in the
wide sense) on X = (a,b) and show how, in generalising the definition of
the usual integral, one may associate with it a measure on X which we shall
again denote by μ. One can show that in this way one obtains all the positive
measures on X. The results of this n° are rarely useful and will not be used
in this volume, so the reader may go directly on to the following n°; but the
arguments brought into play are excellent exercises in analysis.


(i) Definition of the measure. Instead of defining μ(f) directly for every
f ∈ L(X) we shall first do so for the step functions which vanish outside
a compact subset⁶³ of X. As in n° 1, this is equivalent to attributing a
“measure” to each bounded interval I = (u,v) such that [u,v] ⊂ X; this
condition is superfluous if X is compact, but if X is open at one of its
endpoints where the function μ(x) may tend to +∞ or −∞, it avoids infinite
measures. This said, one puts


$$\begin{aligned}
\mu(]u,v[) &= \mu(v-) - \mu(u+), \\
\mu(]u,v]) &= \mu(v+) - \mu(u+), \\
\mu([u,v[) &= \mu(v-) - \mu(u-), \\
\mu([u,v]) &= \mu(v+) - \mu(u-),
\end{aligned} \tag{32.1}$$


agreeing that μ(a−) = μ(a) at the left endpoint of X and μ(b+) = μ(b) at
the right endpoint when X contains a or b. Then μ[(u,v)] = μ(v) − μ(u)
if the function μ is continuous, but, as one sees immediately, the definition
chosen in the general case permits point masses at the points where μ is
discontinuous: the measure of a singleton interval [u, u] is equal to the jump

$$\mu(u+) - \mu(u-) = \Delta\mu(u) \tag{32.2}$$

of the function μ at this point. One may note in passing that these formulae
involve only the right and left limit values of μ; one could for example assume
μ continuous on the right, replacing μ(x) by μ(x+) for every x.

The main merit of these definitions is that, if one has a partition of I =
(u, v) into pairwise disjoint intervals I_1, …, I_n, then

$$\mu(I) = \sum_p \mu(I_p). \tag{32.3}$$

One may actually assume that I_p = (x_p, x_{p+1}) with weak inequalities

$$u = x_1 \le x_2 \le \dots \le x_{n+1} = v$$

to allow for the possibly singleton intervals. In calculating μ(I_1) + μ(I_2), for
example, two cases are possible; if x_2 belongs to I_2, then it does not belong
to I_1, and the sum is equal to

$$[\mu(x_2-) - \mu(u?)] + [\mu(x_3??) - \mu(x_2-)] = \mu(x_3??) - \mu(u?),$$


63 The reader may assume X compact to start with, which simplifies the arguments 
a little. 
where ? is the sign − or + according as I_1 contains x_1 = u or not, and where
?? is likewise + or − according as I_2 contains x_3 or not. If, on the contrary,
x_2 does not belong to I_2, and so belongs to I_1, one finds

$$[\mu(x_2+) - \mu(u?)] + [\mu(x_3??) - \mu(x_2+)] = \mu(x_3??) - \mu(u?).$$

On pursuing these small calculations step-by-step one finds finally that the
right hand side of (3) is equal to μ(v??) − μ(u?), i.e. to μ(I).
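
The definitions (1) and the additivity (3) can be checked on a small example. The sketch below is only an illustration under stated assumptions: the increasing function is taken of the form μ = g + (finitely many jumps), with g continuous, so that the one-sided limits are known exactly; the helper names `right_limit`, `left_limit` and `measure` are hypothetical.

```python
def right_limit(x, g, jumps):
    """mu(x+): continuous part plus every jump located at a point <= x."""
    return g(x) + sum(c for xi, c in jumps.items() if xi <= x)

def left_limit(x, g, jumps):
    """mu(x-): only the jumps located strictly to the left of x are counted."""
    return g(x) + sum(c for xi, c in jumps.items() if xi < x)

def measure(u, v, left_closed, right_closed, g, jumps):
    """Measure of the interval with endpoints u <= v, following (32.1)."""
    upper = right_limit(v, g, jumps) if right_closed else left_limit(v, g, jumps)
    lower = left_limit(u, g, jumps) if left_closed else right_limit(u, g, jumps)
    return upper - lower

# Assumed example: g(x) = x with point masses 1/4 at 0 and 1/2 at 1.
g = lambda x: x
jumps = {0.0: 0.25, 1.0: 0.5}

whole = measure(0.0, 2.0, True, True, g, jumps)        # [0, 2]
parts = (measure(0.0, 1.0, True, False, g, jumps)      # [0, 1[
         + measure(1.0, 1.0, True, True, g, jumps)     # singleton [1, 1]
         + measure(1.0, 2.0, False, True, g, jumps))   # ]1, 2]
```

The singleton [1, 1] receives exactly the jump at 1, and the three pieces add up to the measure of [0, 2], whichever piece the common endpoint is attached to.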

This being so, one defines the integral of a step function φ by the obvious
formula: one chooses intervals I_p ⊂ X, pairwise disjoint, finite in number,
and with compact union K ⊂ X, such that φ is constant on the I_p and zero
outside K, and puts

$$\mu(\varphi) = \sum_p \varphi(\xi_p)\,\mu(I_p) \tag{32.4}$$

where ξ_p ∈ I_p for every p. The additivity formula (3) guarantees as in n° 1
that the integral depends only on φ and not on the partition chosen; it is then
obvious that the map φ ↦ μ(φ) is linear, that μ(φ) ≥ 0 for every φ ≥ 0, and
that it is continuous in the sense that if φ is zero outside a compact interval
K ⊂ X then

$$|\mu(\varphi)| \le \mu(K)\,\|\varphi\|_K \tag{32.5}$$

since the I_p are, or may be assumed to be, all contained in K, so that
Σ μ(I_p) ≤ μ(K).

With these conventions, one may construct a Riemann integration theory
as in n° 2 of this chapter. To define μ(f) for an f ∈ L(X, K) for instance,
choose a sequence of step functions φ_n which vanish outside K and
converge uniformly to f; then μ(f) = lim μ(φ_n) exists (Cauchy’s criterion) and
depends only on f.

To proceed further, let us consider a finite partition of K into intervals
I_p sufficiently small for f to be constant to within r on each I_p, and let us
choose an x_p in each I_p. Then ‖f − Σ f(x_p)χ_{I_p}‖_K ≤ r, whence

$$\Bigl|\mu(f) - \sum_p \mu(I_p)\,f(x_p)\Bigr| \le \mu(K)\,r. \tag{32.6}$$

From here, the fact that the map f ↦ μ(f) of L(X, K) into C satisfies
Theorem 1 is too obvious to deserve yet another model proof.


(ii) History. The measures that we have just described were published
in 1895 by the Dutchman Thomas Stieltjes (1856-1894), then professor at
Toulouse, in a long memoir on certain analytic functions which he represents
by a formula

$$f(z) = \int \frac{d\mu(t)}{t - z}$$
and also studies series and true integrals. In the Louis XIV typography⁶⁴ of
the Annales de la Faculté des Sciences de Toulouse of the period, he devoted
no more than two pages to explaining the construction of his integral with
respect to a monotone function; although remarking on the analogy with a
distribution of masses and the fact that the discontinuities of the function
μ correspond to discrete masses, Stieltjes restricts himself to saying that,
to integrate a function f(x) over a bounded interval (a,b), one considers a
subdivision a = x_1 < x_2 < … < x_n = b (strict inequalities) of it, chooses
points ξ_p such that x_p ≤ ξ_p ≤ x_{p+1} (weak inequalities) and one calculates
the sum

$$\sum_p f(\xi_p)\,[\mu(x_{p+1}) - \mu(x_p)]$$

which, according to him, converges to the integral ∫ f(x)dμ(x) as the
subdivision becomes ever finer; he proves nothing, refrains from detailing the role
of the discontinuities of μ, and says only that one argues as in the usual case,
which is a little optimistic if one starts from his formula; after this he returns
to his analytic functions.

This first generalisation of the Riemann integral provoked no interest,
notably not even on the part of Lebesgue who passed over it in silence in
his 1903 book, where he expounds his own work. But in 1909, a Hungarian
mathematician, Frigyes Riesz (1880-1956), one of the creators of functional
analysis, showed in a note in the Comptes rendus de l’Académie des sciences
de Paris that, if K is a compact interval, then every continuous linear form on
L(K) is a difference of Stieltjes integrals f ↦ ∫ f(x)dμ(x) (i.e. is defined by a
not necessarily positive measure μ⁶⁵); at the same period, Hilbert, Hellinger
and Toeplitz, who began to generalise the classical “diagonalisation” in finite
dimensions to the linear operators on a Hilbert ... space, proclaimed the
usefulness of Stieltjes integrals in their theory; we shall explain why in volume
IV [Chap. XI, n° 22, (iv)]. As a result, Lebesgue remarked in the second
edition of his book, in a rather roundabout way, that his theory extends to the
Stieltjes integrals, which was incontestable but a little delayed. In 1913 Radon
generalised the Stieltjes integral to the case of several variables starting in R^n
from a function μ(E) defined on reasonable subsets E of the space and
satisfying additivity conditions analogous to those of Lebesgue measure⁶⁶; he


64 Many French administrations still have a tendency to use it; it minimises the 
import of the text, i.e. the information made public. The comparison with official 
American documents, parliamentary reports for example, is edifying: several 
thousand pages of dense text every year for the discussion in committee of the 
defence budget, several dozen with wide margins in France. 

65 For a proof, see Walter Rudin, Real and Complex Analysis (McGraw Hill, 1966),
Chap. 2. 

66 This is the method that one finds in Hans Grauert and Ingo Lieb, Differential-
und Integralrechnung III (Springer, 1968), Chap. I. Playing with half open and
semi closed parallelepipeds is not exactly relaxing. It is easier in Rudin, but since
he first devotes a whole chapter to “abstract” measures which he never uses in 
his book, the energy expended in pure loss is about the same. 
shows how to integrate with respect to such a measure. One could then easily
liberate oneself from the hypothesis that X ⊂ R^n (Maurice Fréchet, 1915),
after which the general theory would occupy a generation of mathematicians,
not to say two⁶⁷, which vied to generalise them, even though the results are
not always of great use, or put to work principles very different from those
of Lebesgue or Radon.


(iii) The increasing function associated with a discrete measure. Consider
the discrete measure (31.9) of the preceding n°, assuming that the c(ξ), ξ ∈ D,
are positive so as to obtain a positive measure. To obtain the increasing
function μ(x) which, according to F. Riesz (there was also a Marcel Riesz,
his brother, a first class analyst and great lover of spirits, at Lund, while
Frigyes served his whole career in Hungary) defines it, let us choose once and
for all a c ∈ X (it is simplest to choose c = a if a ∈ X) and put⁶⁸

$$\mu(x) = \begin{cases} \displaystyle \sum_{\xi \in\, ]c,x]} c(\xi) & \text{if } c \le x, \\[1ex] \displaystyle -\sum_{\xi \in\, ]x,c]} c(\xi) & \text{if } x < c. \end{cases} \tag{32.7}$$


Each of these series is a partial sum of an unconditionally convergent series, so
converges. The function μ is increasing since, as x increases, the sum (7) contains
more and more positive terms if x > c and fewer and fewer negative terms if
x < c. It is even right continuous. For consider two points x and x + h > x
and assume first that c ≤ x; now μ(x + h) − μ(x) is the sum of the masses
c(ξ) contained in the interval ]x, x + h]; since the series Σ c(ξ) converges
unconditionally there is, for every r > 0, a finite subset F of D ∩ ]x, x + h]
such that the sum of the c(ξ) for ξ ∉ F is < r; the elements of F being finite
in number, ]x, x + h] contains no element of F if h is small enough. The
difference μ(x + h) − μ(x) is thus < r for h > 0 small enough, whence the right
continuity of μ in this case. If x < x + h ≤ c, the difference μ(x + h) − μ(x)
is again the sum of the masses contained in ]x, x + h] and the argument is
the same. Note in passing the importance of not confusing the signs [ and ],
the classical pitfall in this subject ...

To calculate μ(x−), note that for h > 0, μ(x) − μ(x − h) is the sum of
the masses contained in ]x − h, x], a sum which always contains the mass
c(x), possibly zero, placed at the point x and which, for h small enough, is
arbitrarily close to it, as above. In other words, μ(x) − μ(x−) = c(x), whence

$$\mu(x+) = \mu(x) = \mu(x-) + c(x), \tag{32.8}$$


67 The book of Jean-Paul Pier, Histoire de l’intégration (Masson, 1996) contains
a very rich and useful bibliography, but cites numerous not always illuminating 
commentaries, especially when written in the “academic eulogy” style; no one 
ever needed Darboux to know that Riemann was a great mathematician (and, 
moreover, not on account of his integral ...). 

68 Compare with the function (13.1) used in extending the FT to the primitives of
regulated functions, in which the weak inequality is replaced by a strict
inequality.
so that μ(x) has a discontinuity of amplitude Δμ(ξ) = c(ξ) at each point
ξ ∈ D and is continuous at the other points of I.

(Since the ξ might well be all the rational points of X, this shows in
passing that the increasing functions are not just those suggested by the 
usual naive sketches.) 

This done, it remains to apply the definitions (1) and to check that one
recovers the formula

$$\mu(I) = \sum_{\xi \in I} c(\xi) \tag{32.9}$$

for every bounded interval I = (u,v) such that [u,v] ⊂ X.
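
For a discrete measure carrying only finitely many masses, formulas (7), (8) and (9) can be verified exactly. The following sketch is an illustration under that finiteness assumption (the text allows a countable set D); it uses exact rational arithmetic, and the names `mu`, `mu_left` and `interval_measure` are hypothetical.

```python
from fractions import Fraction

def mu(x, masses, c):
    """Increasing function (32.7) attached to finitely many point masses."""
    if x >= c:
        return sum((m for xi, m in masses.items() if c < xi <= x), Fraction(0))
    return -sum((m for xi, m in masses.items() if x < xi <= c), Fraction(0))

def mu_left(x, masses, c):
    """Left limit mu(x-) = mu(x) - c(x), as in (32.8)."""
    return mu(x, masses, c) - masses.get(x, Fraction(0))

def interval_measure(u, v, left_closed, right_closed, masses, c):
    """(32.1) for a right-continuous mu, for which mu(x+) = mu(x)."""
    upper = mu(v, masses, c) if right_closed else mu_left(v, masses, c)
    lower = mu_left(u, masses, c) if left_closed else mu(u, masses, c)
    return upper - lower

# Assumed example: masses 1/2, 1/4, 1/8 at the points -1, 0, 2; base point c = 0.
masses = {-1: Fraction(1, 2), 0: Fraction(1, 4), 2: Fraction(1, 8)}
c = 0
closed = interval_measure(-1, 2, True, True, masses, c)     # [-1, 2]
opened = interval_measure(-1, 2, False, False, masses, c)   # ]-1, 2[
```

The closed interval collects all three masses, while the open interval keeps only the mass at 0, in agreement with (9).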

Next we have to check that the integral of a function f ∈ L(X, K) is
indeed given by formula (31.9) of the preceding n°. For this, let us consider
a partition of K into intervals I_p on which f is constant to within r and
apply the approximation (6); the only intervals which count are those which
contain points of D and (9) shows that

$$\mu(f) = \sum_p f(x_p) \sum_{\xi \in I_p} c(\xi)$$

to within μ(K)r. But for ξ ∈ I_p one has f(x_p) = f(ξ) to within r; on replacing
f(x_p) by f(ξ) for all the ξ ∈ I_p in the preceding formula, one commits, for
each p, an error less than r times the sum of the c(ξ) for ξ ∈ I_p, so a total
error less than μ(K)r. Since f is zero outside K we have

$$\mu(f) = \sum_{\xi \in D} c(\xi)\,f(\xi) \tag{32.10}$$

to within 2μ(K)r, whence the equality since r > 0 is arbitrary.


(iv) The discrete and continuous components of a measure. The sums (9)
appear again in the general case of an arbitrary increasing function μ when
one wants to make the rôle of the discontinuities of μ in calculating μ(f)
more precise. Furthermore we shall see that as well as a “continuous sum”,
as in the classical case, μ(f) includes a “discrete sum”, namely the sum

$$\mu_d(f) = \sum_{\xi} \Delta\mu(\xi)\,f(\xi) \tag{32.11}$$

extended over all the discontinuities of the increasing function μ; recall that

$$\Delta\mu(\xi) = \mu(\xi+) - \mu(\xi-).$$

First note that the set D of these points of discontinuity is countable⁶⁹
and that

$$\sum_{\xi \in K} \Delta\mu(\xi) \le \mu(K)$$


69 This also results from the fact that a monotone function is regulated.
for every compact K = [u,v] ⊂ X. To see this, one first remarks that the
partial sum extended over a finite subset F of D ∩ K is ≤ μ(K); for if one
orders the points ξ_1 < … < ξ_n of F one has

$$\mu(\xi_1-) \le \mu(\xi_1+) \le \dots \le \mu(\xi_n-) \le \mu(\xi_n+);$$

the sum in question is thus majorised by μ(ξ_n+) − μ(ξ_1−) and so by μ(v+) −
μ(u−) = μ(K) since μ(x) is increasing. For every integer n ≥ 1, the ξ ∈ K
where Δμ(ξ) > 1/n are therefore finite in number, whence simultaneously
the denumerability of D ∩ K, so of D since X is the union of a sequence
of compact sets, and the desired inequality, which makes the series (11)
absolutely convergent for every bounded function f vanishing outside K, so
for f ∈ L(X,K).

Now the expression (11) reenters the preceding framework (iii). One is
thus led to associate with the expression (11), a discrete measure on X, the
corresponding increasing function μ_d given by (7), here with c(ξ) = Δμ(ξ).
Now assume μ(x) right continuous, which we may, as we have seen above, and put

$$\mu(x) = \mu_d(x) + \mu_c(x); \tag{32.12}$$

we thus define a continuous function μ_c since μ and μ_d have the same points
of discontinuity ξ ∈ D, are both right continuous, and have the same “jumps”
at these points. The function μ_c is moreover increasing. To see this, we have
to check that, for x ≤ y, we have μ(x) − μ_d(x) ≤ μ(y) − μ_d(y), i.e.

$$\mu_d(y) - \mu_d(x) \le \mu(y) - \mu(x) = \mu(y+) - \mu(x+),$$

which, by (7), means that the sum of the jumps of the function μ at the
points of the interval ]x, y] where it is discontinuous is less than its variation
between x and y, which is clear as we saw in proving (8).

This done, the functions μ(x), μ_d(x) and μ_c(x) define Radon measures on
X and it is obvious (use Riemann sums) that, for every continuous function
f ∈ L(X), one has

$$\mu(f) = \mu_d(f) + \mu_c(f) = \mu_c(f) + \sum_{\xi} \Delta\mu(\xi)\,f(\xi). \tag{32.13}$$

Since the function μ_c(x) is continuous, the formulae (1) simplify:

$$\mu_c(I) = \mu_c(v) - \mu_c(u) \quad \text{if } I = (u,v), \tag{32.14}$$

whether I is open, or closed, or ..., and it is vain, in the “Riemann sums”
for μ_c, to allow for the nonexisting discontinuities of μ_c(x). The measure μ_d
provides the “discrete sum” and the measure μ_c the “continuous sum” to
which we alluded above.

The ideal case is where the function μ_c(x) is of class C¹. The mean value
theorem then shows that if I = (u,v)

$$\mu_c(I) = \mu_c'(w)(v - u) = \mu_c'(w)\,m(I)$$

for a w ∈ I, where m(I) is Euclidean length. The Riemann sum Σ μ_c(I_p) f(x_p)
relative to a sufficiently fine partition of K, and in which x_p can be assumed
equal to w_p, becomes Σ f(w_p) μ_c′(w_p) m(I_p); on decorating this calculation
with the inevitable ε and δ and calling μ what we would call μ_c, one then
finds that for every increasing function μ of class C¹ (so continuous), one
has

$$\int f(x)\,d\mu(x) = \int f(x)\,\mu'(x)\,dx; \tag{32.15}$$

in 1697, but not in 1997, Leibniz would have said to you: obvious since
dμ(x) = μ′(x)dx ... But here like elsewhere, it is not the notation which
makes the formula obvious; it is the formula that explains the notation.
Consider for example on X = ]0,+∞[ the measure

$$f \longmapsto \int f(x)\,\frac{dx}{x}$$

where one integrates over X. It corresponds to the monotone function μ(x) =
log x: this is of class C¹ and has derivative 1/x, whence the result by (15).
For this measure, the measure of an interval I = (u,v) with 0 < u < v < +∞
is μ(v) − μ(u) = log(v/u). One could extend this argument to any positive
continuous function p other than 1/x: the increasing function defining the
Radon measure

$$f \longmapsto \int f(x)\,p(x)\,dx$$

is any primitive P(x) of p; the measure P(v) − P(u) of an interval I is the
integral of p extended over I.
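
Formula (15) can be tested numerically on the example μ(x) = log x. The sketch below is a plausibility check, not a proof: it restricts attention to the compact interval [1, e], takes f(x) = x there as an assumed test function, and compares a Stieltjes sum of the kind written by Stieltjes with a midpoint Riemann sum of f(x)μ′(x). The helper names `stieltjes_sum` and `density_sum` are hypothetical.

```python
import math

def stieltjes_sum(f, mu, a, b, n):
    """Sum of f(x_p) [mu(x_{p+1}) - mu(x_p)] over n equal subintervals of [a, b]."""
    h = (b - a) / n
    total = 0.0
    for p in range(n):
        x_p = a + p * h
        total += f(x_p) * (mu(x_p + h) - mu(x_p))
    return total

def density_sum(f, dmu, a, b, n):
    """Midpoint Riemann sum of f(x) mu'(x) over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (p + 0.5) * h) * dmu(a + (p + 0.5) * h) * h for p in range(n))

f = lambda x: x
lhs = stieltjes_sum(f, math.log, 1.0, math.e, 10_000)           # integral of f d(log x)
rhs = density_sum(f, lambda x: 1.0 / x, 1.0, math.e, 10_000)    # integral of f(x) dx / x
exact = math.e - 1.0                                            # integral of 1 over [1, e]
```

Both sums approach e − 1 as the subdivision is refined; the Stieltjes sum converges more slowly here because f is sampled at the left endpoint of each subinterval.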

Exercise. Prove (15) assuming that μ(x) is a primitive of a regulated
function ≥ 0.

Finally we remark that there are much more complicated monotone
functions than the preceding, and for which the nondiscrete component of the
corresponding integral ∫ f(x)dμ(x) is not of the form ∫ f(x)p(x)dx, even if
you permit “densities” p(x) that are Lebesgue integrable on every compact
set (“singular” measures concentrated on sets of measure zero). The general
theory of integration treats them exactly like the others.


33 — Application to double integrals 


As we saw at the end of n° 9 and in n° 30, one may integrate any continuous
function f(x,y) on a rectangle X × Y with respect to Lebesgue measure dxdy
or, more generally, with respect to a product measure dμ(x)dν(y), where X
and Y are compact intervals, and even if X and Y are noncompact, provided
that f ∈ L(X × Y). But to integrate over an arbitrary compact set K ⊂ X × Y
(in other words, and by definition, to integrate the function equal to f on K
and to 0 elsewhere over X × Y) is not as simple, unless one imposes ad hoc
hypotheses on K as the physicists and engineers do, or, like a mathematician
of the last century, declares that “the possibility of inverting the integrations
rests on the obvious principle that a sum remains the same for any order in
which one adds the parts⁷⁰”. The end of n° 9 has demonstrated the difficulty
of the general problem.

The first problem is to define an integral over a compact set K ⊂ X × Y,
in other words to integrate the function φ equal to f on K and to 0 at the
other points of X × Y. Most luckily, it is upper semicontinuous if f is positive.
For, consider a point a ∈ X × Y. If a ∉ K, then φ(x) = 0 on a neighbourhood
of a in X × Y since K is closed, whence the continuity of φ at a. At a point
a ∈ K, since f is continuous on K, for every r > 0 there exists an open disc
B(a) such that φ(x) = f(x) ≤ f(a) + r on K ∩ B(a); since φ(x) = 0 for x ∉ K
and f(a) ≥ 0, we have φ(x) ≤ f(a) + r = φ(a) + r for every x ∈ B(a), whence
the result. If f is negative, the function φ is on the contrary lsc since it is the
negative of the usc function constructed starting from −f. If f changes sign,
catastrophe: we do not know how to integrate a function which is neither lsc
nor usc. But one may always write f = f⁺ − f⁻ and deal with f⁺ and f⁻,
for lack of a less crude method.

In these circumstances, we may as well generalise and establish the standard
result:


Theorem 31. Let X and Y be two intervals, μ and ν positive measures
on X and Y, and λ the product measure on X × Y. Let φ be an lsc (resp.
usc) function on X × Y which is⁷¹ the upper (resp. lower) envelope of the
f ∈ L(X × Y) which minorise (resp. majorise) it. Then we have the following
properties:


(i) For every x ∈ X, the function y ↦ φ(x, y) is lsc (resp. usc);
(ii) the function

$$x \longmapsto \int \varphi(x,y)\,d\nu(y), \tag{33.1}$$

with values > −∞ (resp. < +∞), is lsc (resp. usc);


(iii) we have

$$\lambda(\varphi) = \int d\mu(x) \int \varphi(x,y)\,d\nu(y), \tag{33.2}$$

the two members being simultaneously finite or infinite;


70 Joseph Bertrand in his Traité de calcul différentiel et intégral (1870), cited by
Jean-Paul Pier, Histoire de l’intégration (Masson, 1996), p. 104. Bertrand taught
analysis at the École polytechnique from 1856 to 1894, which indicates the
capacity for renewal of the institution at this period. There also, in parallel, were
Charles Hermite (1869-1876) and Camille Jordan (1876-1911), who introduced
some notions of set theory before 1900.

71 A condition always satisfied if X and Y are compact or if φ is positive (resp.
negative), etc.
(iv) we may invert the order of the integrations in (2).


We shall examine the case of an usc function, the other being trivial
to deduce from it: multiply the function by −1. As always, the two crucial
points will be that (a) a lower envelope of continuous functions is usc; (b) one
may calculate the integral of an usc function from any decreasing philtre of
continuous functions having φ as lower envelope. It remains to combine these
tools with the definitions; there is nothing more to the very short proof.

That the function y ↦ φ(x,y) = φ_x(y) is usc on Y for every x ∈ X may
be seen either by using the definition in terms of inequalities, or from the
fact that φ is the lower envelope of the set Φ = L_sup(φ) of the functions
f ∈ L(X × Y) which majorise it.

One may therefore integrate φ_x, the result being < +∞, though it can
be −∞ for an usc function. For x given, it is clear that the set of functions
f_x(y) = f(x,y), where f ∈ Φ, is a decreasing philtre of continuous functions
on Y whose lower envelope is φ_x. Thus

$$\nu(\varphi_x) = \inf_{f \in \Phi} \nu(f_x). \tag{33.3}$$

Let us now put F_f(x) = ν(f_x) = ∫ f(x,y)dν(y) and F_φ(x) = ν(φ_x). Since
the f vary in a decreasing philtre, it is the same for the F_f, because

$$f \le g \Longrightarrow f_x \le g_x \Longrightarrow F_f(x) \le F_g(x)$$

since ν is positive. Now the functions F_f(x) = ν(f_x) are continuous on X
because, f being in L(X × Y), the f_x are zero outside the same compact
subset of Y, which allows us to argue as in n° 9, Theorem 9. Since

$$F_\varphi(x) = \nu(\varphi_x) = \inf \nu(f_x) = \inf F_f(x),$$

one concludes both that F_φ is usc (point (ii) of the statement) and that

$$\mu(F_\varphi) = \inf \mu(F_f). \tag{33.4}$$


But (4) can again be written

$$\int d\mu(x) \int \varphi(x,y)\,d\nu(y) = \inf \int d\mu(x) \int f(x,y)\,d\nu(y) = \inf \lambda(f) \tag{33.5}$$

where the inf is relative to the f ∈ Φ. Since, in (5), f runs through the set of
f ∈ L(X × Y) which majorise φ, the right hand side of (5) is, by definition of
the integral of an usc function, equal to the integral λ(φ) of φ with respect
to λ. The relation (5) therefore establishes (2). Point (iv) of the statement is
then obvious, qed.

To complete the enunciation of Theorem 31, assume that the function φ
is lsc, positive and integrable on X × Y, i.e., that λ(φ) < +∞. By (2) we
then have

$$\int d\mu(x) \int \varphi(x,y)\,d\nu(y) < +\infty;$$

adopting the notation of the proof, this means that the function F_φ(x) =
∫ φ(x,y)dν(y), which is lsc and positive, has finite integral and so is integrable
with respect to μ. As we have seen (n° 11, Exercise) for the case of Lebesgue
measure on a compact interval (but it generalises immediately) we can
deduce that F_φ is finite outside a set N of μ-measure zero, and therefore the
function y ↦ φ(x,y), which is lsc, is integrable with respect to ν for every
x ∈ X − N.

In the case we started from at the beginning of this n°, where $\varphi$ is equal
to a continuous positive function f on a compact set $K \subset X \times Y$ and to 0
outside K, which ought, by definition, to lead to the integral of f on K, it is
helpful to make (2) a little more explicit, where everything is now finite since
$\varphi$ is bounded above and below. Now, to integrate $\varphi(x, y)$ with respect to y
for x given is equivalent to integrating over Y the function equal to $f(x, y)$
if $(x, y) \in K$ and to 0 elsewhere. If one denotes by K(x) the compact set of
the $y \in Y$ such that $(x, y) \in K$ (the “section” of K by the vertical through
the point x), the number $\int \varphi(x, y)\,dy$ is then obtained by integrating over Y
the function equal to $f(x, y)$ for $y \in K(x)$ and to 0 elsewhere, and this is usc
(same argument as in $X \times Y$). It is natural to denote this integral by


(33.6) $\int_{K(x)} f(x, y)\,dy;$


the Lebesgue-Fubini formula is then written


(33.7) $\iint_K f(x, y)\,dx\,dy = \int dx \int_{K(x)} f(x, y)\,dy$


according to tradition. But as we have already observed at the end of n° 9 the 
sets K(x) can be arbitrary compact sets in Y, so that, as intuitive as it may 
seem, formula (7) masks an integration theory already much more advanced 
than that of Riemann. And we have had to assume f positive to establish it! 
Of course, this restriction, which may seem ridiculous, will be eliminated by 
the complete Lebesgue theory (Chap. XI, § 4, n° 10).


§ 10. Schwartz distributions


34 — Definition and examples 


We showed in the preceding § how one may introduce the concept of the
integral, classical or not, into a particularly simple general framework: that
of continuous linear forms on a vector space endowed with a “topology”.

In analysis there is another operation bearing on functions and possessing
analogous linearity and continuity properties: derivation. This is not defined
for every continuous function but it is easy to introduce it into the present
framework. Consider for simplicity a compact interval $K \subset \mathbb{R}$ and the vector
space $C^1(K)$ of the functions of class $C^1$ in K, endpoints included. Choose
measures $\mu$ and $\nu$ in K and, for every $f \in C^1(K)$, put


(34.1) $T(f) = \int f(x)\,d\mu(x) + \int f'(x)\,d\nu(x).$

It is clear that $f \mapsto T(f)$ is linear and that


(34.2) $|T(f)| \le M(\mu)\|f\|_K + M(\nu)\|f'\|_K \le M(\|f\|_K + \|f'\|_K)$

where M is a constant independent of f, whence, for any $f, g \in C^1(K)$,


$|T(f) - T(g)| \le M(\|f - g\|_K + \|f' - g'\|_K).$


If a sequence of functions $f_n \in C^1(K)$ converges uniformly to a limit f and if
the $f_n'$ converge uniformly to a limit g, in which case $f \in C^1(K)$ and $g = f'$
(Theorem 19 of Chap. III, n° 17), then $T(f) = \lim T(f_n)$.

These results can be interpreted on defining a norm on $C^1(K)$ by the
formula


$|||f||| = \|f\| + \|f'\|$


where the norms on the right hand side are the uniform norms on K. The
relation (2) shows that T is a continuous linear form on $C^1(K)$ for this norm.
Theorem 19 of Chap. III, on the other hand, shows that Cauchy’s criterion
holds in $C^1(K)$: if indeed one has $|||f_p - f_q||| < r$ for p and q large, then also
$\|f_p - f_q\| < r$ and $\|f_p' - f_q'\| < r$; consequently, the $f_p$ and their derivatives
converge uniformly to functions f and g with $g = f'$, and it is clear, as above, that f
is the limit of the $f_p$ in the sense that $\lim |||f - f_p||| = 0$. In other words,
$C^1(K)$ is a complete normed vector space, in short a Banach space. It can
be shown that there are no continuous linear forms on $C^1(K)$ other than the
expressions of the form (1).
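
The bound (34.2) can be checked numerically in a simple case. The following is only a sketch under assumptions that are not in the text: K = [0, 1], $\mu$ is Lebesgue measure on K, $\nu$ is the Dirac measure at 1/2 (so that M = 1 works), and integrals and uniform norms are approximated on a grid. The names `T`, `sup_norm`, `c1_norm` and the test function are illustrative choices.

```python
import math

N = 2000  # number of midpoint cells on K = [0, 1] (an arbitrary choice)
GRID = [(i + 0.5) / N for i in range(N)]

def sup_norm(g):
    """Approximate the uniform norm of g on K by sampling the grid.
    Sampling can only underestimate the true supremum."""
    return max(abs(g(x)) for x in GRID)

def T(f, df):
    """T(f) = integral of f over [0, 1] (midpoint rule) + f'(1/2).
    Here mu = Lebesgue measure and nu = Dirac measure at 1/2 (assumed)."""
    integral = sum(f(x) for x in GRID) / N
    return integral + df(0.5)

def c1_norm(f, df):
    """The norm |||f||| = ||f|| + ||f'|| on C^1(K)."""
    return sup_norm(f) + sup_norm(df)

# A sample function of class C^1 together with its derivative.
f = lambda x: math.sin(3 * x)
df = lambda x: 3 * math.cos(3 * x)

print(abs(T(f, df)), "<=", c1_norm(f, df))
```

Since T is linear, the same comparison applied to a difference f - g illustrates the continuity estimate that follows (34.2).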

This kind of analogy between measures and derivations directly inspired
Laurent Schwartz, the inventor of the theory of distributions. In the period
when he created his theory one already knew of similar, but limited, attempts,
due, for example, to the German Salomon Bochner, who emigrated to the USA
before 1940, and much more to the Soviet Sergei Sobolev⁷² in his work on
partial differential equations. But it was Schwartz, during and after World War II,
who understood the enormous generality of the concept of distribution and
formulated it in a perfectly clear way, placing it in the framework of the
theory of topological vector spaces⁷³.

To obtain a satisfactory theory it is clearly necessary to allow for derivations
of any order. One therefore has to replace the $C^1$ functions envisaged
above by $C^\infty$ functions; when one works on R, to which we confine ourselves
in this §, we have, as in integration theory, to restrict ourselves to $C^\infty$ functions
of compact support, i.e. zero for |x| large: this is the vector space D
or D(R) we have already used in n° 27 to “regularise”, i.e. make $C^\infty$, with
the help of convolution products, functions which are not. For Schwartz, a
distribution on R is a linear form on D, i.e. a map T of D into C such that⁷⁴


$T(\alpha\varphi + \beta\psi) = \alpha T(\varphi) + \beta T(\psi)$


and satisfying a continuity condition analogous to that imposed on measures
in n° 31, but distinctly less obvious.

In the first place, one wants the measures to be particular distributions. 
This indicates that a distribution cannot have a continuity property unless 
one restricts to the vector subspace D(R, K) = D(K) of the functions which 
vanish outside a given compact subset K of R. 

Since every $\varphi \in D$, and all its successive derivatives, are continuous and so
bounded (compact support), for every integer $r \ge 0$ one may define a norm


(34.3) $\|\varphi\|^{(r)} = \|\varphi\| + \|\varphi'\| + \dots + \|\varphi^{(r)}\|$

on D, where, as in all the rest of this §, the notation


$\|\varphi\| = \sup |\varphi(x)| = \|\varphi\|_{\mathbb{R}}$


denotes the norm of uniform convergence on R; definition (3) directly generalises
the norm $|||f|||$, which we introduced temporarily above, on $C^1(K)$.
Clearly

$\|\varphi + \psi\|^{(r)} \le \|\varphi\|^{(r)} + \|\psi\|^{(r)},$

so that the expression $d_r(\varphi, \psi) = \|\varphi - \psi\|^{(r)}$ satisfies the triangle inequality;
to say that $d_r(\varphi, \psi) < \varepsilon$ implies that, for every $h \le r$, one has
$|\varphi^{(h)}(x) - \psi^{(h)}(x)| < \varepsilon$ for every $x \in \mathbb{R}$. Also


⁷² S. Bochner, Vorlesungen über Fouriersche Integrale (Leipzig, Akademische
Verlagsgesellschaft, 1932); S. Sobolev, Méthode nouvelle à résoudre le problème de
Cauchy... (Mat. Sbornik, 1936, 1(43), pp. 39-71). According to a recent Russian
book on the history of Soviet nuclear weapons, Sobolev was in charge of the
mathematical and computational part of the project in 1943-1953. (Courtesy of
Jean-Marie Kantor.)

⁷³ Schwartz describes his discovery in detail in Chapter 6 of his memoirs, Un
mathématicien aux prises avec le siècle (Paris, Odile Jacob, 1997) (trans.
A Mathematician Grappling with his Century, Birkhäuser, 2001).

⁷⁴ It has been standard usage since Schwartz to denote the elements of D by Greek
letters and to denote “arbitrary” functions by Roman letters.

(34.4) $\|\varphi\|^{(r)} \le \|\varphi\|^{(r+1)}$


for any r and $\varphi$, and $\|c\varphi\|^{(r)} = |c|\,\|\varphi\|^{(r)}$ for every constant $c \in \mathbb{C}$.

The introduction of these norms $\|\varphi\|^{(r)}$ on D allows one to verify the
continuity of linear functions $T(\varphi)$ much more general than (1): choose p + 1
measures $\mu_i$ ($0 \le i \le p$) on R and put


(34.5) $T(\varphi) = \sum_i \int \varphi^{(i)}(x)\,d\mu_i(x) = \sum_i \mu_i(\varphi^{(i)}).$


Then
$\|\varphi^{(i)}\| \le \|\varphi\|^{(p)}$


for every $i \le p$ and consequently
(34.6) $|T(\varphi)| \le M_K(T)\,\|\varphi\|^{(p)}$ where $M_K(T) = \sum_i M_K(\mu_i)$


by definition (31.5) of a measure.

One might wonder whether it is possible to construct more sophisticated
linear forms on D, involving, maybe, infinitely many derivatives of $\varphi$. This is
sometimes possible:


$T(\varphi) = \sum \varphi^{(n)}(n),$


the series taken over N. There is no problem of convergence since, for a
function zero outside a compact set, all the terms of large enough rank are
zero; and in a given subspace D(K) the preceding formula is just a particular
case of (5), with Dirac measures at the points n situated in K. Another
attempt: put


$T(\varphi) = \sum c_n \varphi^{(n)}(0)$


with the coefficients $c_n$ chosen to make the series converge for any $\varphi$. Now
the derivatives at a point of a $\varphi \in D$ can be chosen arbitrarily (n° 29); the
series $\sum c_n a_n$ would therefore have to converge for any $a_n \in \mathbb{C}$, for example
if $a_n = 1/c_n$; absurd.

These remarks and, of course, a now more extensive experience, show that,
in a given subspace D(K), one should not hope to go further than expressions
of the form (5); in fact, one of the first theorems proved by Schwartz (and
very easy thanks to the theory of Banach spaces⁷⁵) was that, in his theory,
every distribution reduces, on a compact K, to the form (5), with measures
$\mu_i$ depending on K.

This indicates that the continuity condition to impose on a distribution T
is the following: for every compact $K \subset \mathbb{R}$ there exist a $p \in \mathbb{N}$ and a constant
$M_K(T)$ such that


⁷⁵ The corresponding theorem for distributions on $\mathbb{T}$ ($C^\infty$ functions of period 1)
can be proved elementarily with the help of Fourier series, as we shall see in
Chap. VII, end of n° 10.




(34.7) $|T(\varphi)| \le M_K(T)\,\|\varphi\|^{(p)}$ for every $\varphi \in D(K)$.


Why is this a “continuity” property? Because it is natural to define the
notion of convergence of a sequence on each D(K) in the following way:


a sequence of functions $\varphi_n \in D(K)$ converges to a $\varphi \in D(K)$ if for
every $r \in \mathbb{N}$ the sequence of derivatives $\varphi_n^{(r)}$ converges uniformly
to $\varphi^{(r)}$.


This definition is inspired by Theorem 19 of Chap. III, n° 17 and presents
the advantage that, as in $C^1(K)$, there is an analogue of Cauchy’s criterion
for this concept of convergence: to verify that a sequence of functions $\varphi_n \in D(K)$
converges in the preceding sense it suffices (and is necessary) to check that


$\|\varphi_p^{(r)} - \varphi_q^{(r)}\| < \varepsilon$ for p, q large,


for every $r \in \mathbb{N}$ and every $\varepsilon > 0$; the $\varphi_n$ and all their successive derivatives
then converging uniformly to functions vanishing outside K, the theorem in
question assures that the limit $\varphi$ of the $\varphi_n$ is $C^\infty$ and that $\varphi_n^{(r)}$ converges
uniformly to $\varphi^{(r)}$ for any r; in other words, the sequence $\varphi_n$ converges to $\varphi$
in D(K) in the preceding sense.

This said, it is clear that this concept of convergence means that the
uniform distance $\|\varphi^{(r)} - \varphi_n^{(r)}\|$ tends to 0 for any r; it is clearly equivalent to
requiring


(34.8) $\lim \|\varphi - \varphi_n\|^{(r)} = 0$ for every r.


The condition (7) imposed on the distributions then implies that, if a
sequence $\varphi_n \in D(K)$ converges to a limit $\varphi \in D(K)$ in the preceding sense,
one has

$T(\varphi) = \lim T(\varphi_n).$


Conversely, one may prove (it is not totally obvious) that this property
forces the existence of a majorisation (7) for every compact $K \subset \mathbb{R}$.

One should pay attention to the fact that to formulate this continuity
condition one must work in the vector subspaces D(K), failing which one
would restrict considerably the definition of distributions as in integration
theory. The distribution (in fact, a measure) $T(\varphi) = \sum \varphi(p)$, where one
sums over Z, provides a counterexample: take for $\varphi_n$ a $C^\infty$ function with
values in [0, 1/n], vanishing outside [n−1, 2n], and equal to 1/n on [n, 2n−1];
the $\varphi_n$ converge to 0 uniformly on R, but $T(\varphi_n) = 1$ for every n.
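
This counterexample can be made concrete. The sketch below assumes one particular choice of plateau function, glued together from the standard $C^\infty$ step built on exp(−1/t); the text only requires some $C^\infty$ function with the stated properties, so `smooth_step`, `phi`, `T` and `sup_norm` are illustrative names.

```python
import math

def smooth_step(t):
    """A C-infinity function equal to 0 for t <= 0 and to 1 for t >= 1."""
    def g(s):
        return math.exp(-1.0 / s) if s > 0 else 0.0
    return g(t) / (g(t) + g(1.0 - t))

def phi(n, x):
    """Values in [0, 1/n], zero outside [n-1, 2n], equal to 1/n on [n, 2n-1]."""
    rise = smooth_step(x - (n - 1))   # climbs from 0 to 1 on [n-1, n]
    fall = smooth_step(2 * n - x)     # drops from 1 to 0 on [2n-1, 2n]
    return rise * fall / n

def T(n):
    """T(phi_n) = sum of phi_n(p) over the integers p; only n-1 <= p <= 2n matter."""
    return sum(phi(n, p) for p in range(n - 1, 2 * n + 1))

def sup_norm(n):
    """Uniform norm of phi_n, sampled on a grid of step 1/10 over [n-1, 2n]."""
    return max(phi(n, (n - 1) + k / 10) for k in range(10 * (n + 1) + 1))

for n in (1, 2, 5, 50):
    print(n, sup_norm(n), T(n))
```

The uniform norm 1/n tends to 0 while T stays at 1, because the supports [n−1, 2n] are not contained in a fixed compact set.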


Example 1. Every function f that is absolutely integrable on every compact
interval of R (for example $\log |x|$, despite its singularity at the origin) defines a
distribution⁷⁶ which is in fact a measure:


$T_f(\varphi) = \int \varphi(x) f(x)\,dx.$


⁷⁶ In all the rest of this chapter, the $\int$ sign will denote an integral extended over
all R.


More generally, it is clear that every measure, restricted to $D(\mathbb{R}) \subset L(\mathbb{R})$, is
a distribution.


Example 2. Choose an $a \in \mathbb{R}$, an integer $k \in \mathbb{N}$ and put


$T(\varphi) = \varphi^{(k)}(a).$


For k = 0 one obtains the Dirac measure at the point a, denoted by $\delta_a$ or $\varepsilon_a$:


$\delta_a(\varphi) = \varphi(a).$


One could consider more generally distributions of the form


$T(\varphi) = \sum c_n \varphi^{(k_n)}(a_n)$


with a finite number of points $a_n$, given constants $c_n$, and arbitrary orders
of differentiation $k_n$, for example


$T(\varphi) = \varphi(0) + 8\varphi''(4) - \varphi''(\pi).$


One may even allow a countable infinity of points $a_n$, subject to a few precautions.
A formula such as


$T(\varphi) = \sum c_n \varphi^{(k)}(n),$


where one sums over the $n \in \mathbb{Z}$, with a k independent of n, defines a distribution
because, for every compact K, the “series” in reality reduces to a finite
sum for the $\varphi$ vanishing outside K. More subtly, let us choose an arbitrary
sequence of points $a_n \in \mathbb{R}$, also $c_n$ such that $\sum |c_n| < +\infty$, and orders of
differentiation $k_n$ all less than the same integer k and let us put


$T(\varphi) = \sum c_n \varphi^{(k_n)}(a_n).$


Since the derivatives involved are all of order $\le k$, the general term of the
series is, in modulus, less than $|c_n|\,\|\varphi\|^{(k)}$, whence $|T(\varphi)| \le M\|\varphi\|^{(k)}$ where
$M = \sum |c_n|$, which proves continuity.


These elementary examples and the formula (5) are enough to show how 
the theory of distributions allows one to unify the differential and integral 
calculus. 

Exercise. Given an interval $X \subset \mathbb{R}$ write D(X) for the set of functions
defined on X, indefinitely differentiable (including at the endpoints of X
if they belong to X), and zero outside some compact subset of X. Find a
reasonable definition for distributions on X. Find a distribution on X = ]0, 1]
which is not the restriction to the $\varphi \in D(X)$ of a distribution on R.


35 — Derivatives of a distribution


One of the conjuring tricks (it is nothing else, despite its usefulness, and it
is already in Sobolev) which the concept of distribution allows is to attribute
derivatives to functions which do not have them. To understand this, let us
start from a function f of class $C^1$ on R and consider the distribution


$T_{f'}(\varphi) = \int \varphi(x) f'(x)\,dx$


associated to its derivative. In view of the fact that $\varphi(x) = 0$ for |x| large, an
integration by parts shows that

$T_{f'}(\varphi) = -\int \varphi'(x) f(x)\,dx.$

If, as above, one associates the distribution
(35.1) $T_f(\varphi) = \int \varphi(x) f(x)\,dx$
to any $f \in C^1(\mathbb{R})$, then


(35.2) $T_{f'}(\varphi) = -T_f(\varphi').$
Starting from this, one defines the derivative T' of a distribution T by putting


(35.3) $T'(\varphi) = -T(\varphi')$ for every $\varphi \in D$.


Since obviously
$\|\varphi'\|^{(r)} \le \|\varphi\|^{(r+1)}$


for every r, the continuity of T propagates immediately to T'. One may then
repeat this operation and define the successive derivatives of T, clearly given
by


(35.4) $T^{(p)}(\varphi) = (-1)^p T(\varphi^{(p)}).$
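
The identity (35.2), on which definition (35.3) rests, can be checked numerically. This is only a sketch under assumptions of our own: the test function is the standard bump exp(−1/(1−x²)) supported in [−1, 1], f is the sine function, and integrals are approximated by the midpoint rule; none of these choices comes from the text.

```python
import math

def phi(x):
    """Standard bump function in D(R): C-infinity, zero for |x| >= 1."""
    s = 1.0 - x * x
    return math.exp(-1.0 / s) if s > 0 else 0.0

def dphi(x):
    """Derivative of phi, obtained by the chain rule."""
    s = 1.0 - x * x
    p = phi(x)
    return p * (-2.0 * x / (s * s)) if p > 0.0 else 0.0

def integrate(g, a, b, n=20000):
    """Midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def T(f, test):
    """T_f(test) = integral of test(x) f(x) dx; test vanishes outside [-1, 1]."""
    return integrate(lambda x: test(x) * f(x), -1.0, 1.0)

f, df = math.sin, math.cos
lhs = T(df, phi)     # T_{f'}(phi)
rhs = -T(f, dphi)    # -T_f(phi')
print(lhs, rhs)
```

The two numbers agree up to quadrature error, which is exactly what the integration by parts predicts: the integrated-out term vanishes because the test function is zero near the ends of the interval.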


If then you associate to each regulated function f on R the distribution $T_f$
given by (1), you can define the “derivative” of f. But this is no longer a
function in the usual sense (miracles exist even less in Nature than infinitely
small numbers); it is a distribution which can be fearfully complicated
and which, for this reason, one refrains in general from calculating explicitly.
The difficulty does not appear if f is $C^\infty$; in this case, one may apply the
traditional formula for integration by parts ad libitum and obtain the formula


(35.5) $(T_f)^{(p)} = T_{f^{(p)}},$


which shows that the definition of the successive derivatives of a distribution
is compatible with that of the successive derivatives of a $C^\infty$ function.

Example 1. Take for T the Dirac measure $\delta(\varphi) = \varphi(0)$
at the origin. We find $T'(\varphi) = -\varphi'(0)$ in accordance with Dirac’s baroque
formula to which we alluded at the beginning of n° 27. One might, like Dirac
himself, continue:


$T''(\varphi) = +\varphi''(0)$, $T'''(\varphi) = -\varphi'''(0)$, etc.


One may ask why it was twenty years before a mathematician justified this 
“obvious” calculation. In Dirac’s time, some theoretical physicists had begun 
to understand what was already called a functional space, i.e. a vector space 
of infinite dimension whose elements are, not the usual vectors in Euclidean 
or relativistic space or in the configuration space of a system of particles, 
but functions of one or several real variables; and in which one has a concept 
of “convergence” coming from a “norm”. From 1930 on, it was clear that
the probabilistic interpretation of the Schrödinger equation of quantum mechanics
associates with every system of physical particles a square integrable
function (in the sense of the Lebesgue theory) on a Cartesian space E of
finite dimension whose points correspond to all the possible configurations of
the system; when one integrates the square of this function over a subset M
of the space of possible configurations one obtains the probability that the
configuration of the system at the instant considered is one of those in M.
This is what the mathematicians already called a Hilbert space, an infinite
dimensional generalisation of Euclidean spaces endowed with a scalar product⁷⁷
(see the Appendix to Chap. III, n° 5, and the chapter on Fourier series).
But the right functional space, which would have allowed one to understand 
Dirac’s acrobatics, namely Schwartz’ space D, had not yet been invented, 
either because no one had thought of it, or, more probably, because no one


⁷⁷ A very important part of quantum mechanics was invented by physicists working
either permanently or temporarily at Göttingen or nearby, or regularly appearing
as participants at the international meetings that took place in Copenhagen,
Cambridge, Munich, Hamburg, Zürich, but not in France: the “travelling seminar”
of the physicists of the period, where many Americans of the Second World
War learned their trade; see Donald Fleming and Bernard Bailyn, The Intellectual
Migration (Harvard UP, 1969), Daniel J. Kevles, The Physicists (MIT Press,
1971) or Richard Rhodes, The Making of the Atomic Bomb (Simon & Schuster,
1986, 886 pp.), who explains the system and where you will find much other
information. It happens that Hilbert and other mathematicians well informed about
the latest progress in “modern” mathematics were also professors at Göttingen or
nearby; the famous Methoden der Mathematischen Physik of Courant and Hilbert
appeared at this time, the “abstract” theory of Hilbert spaces was constructed in
1927-1930 by von Neumann at Göttingen and Hamburg, and was integrated into
quantum mechanics in his Grundlagen der Quantenmechanik (Springer, 1929).
Von Neumann was in the USA from 1933, like Richard Courant who founded
an institute of applied mathematics at the University of New York, which, after
1945, prospered in the regular American way of the time: military contracts.

had had the idea of considering vector spaces in which convergence is defined
not by a single distance, as in the Hilbert or Banach spaces already known
(the second ones only by few mathematicians), but by infinitely many functions
$d_r(f, g)$ as in Schwartz’ space D. To understand this kind of situation
it would have been necessary simultaneously to master integration theory,
partial differential equations in several variables to have really interesting
examples, the “abstract” linear algebra then in development, and the general
theory of topological vector spaces where one is given in advance a family
of “distances” that determine convergence.

This was much too much for the physicists and even for the immense majority
of the mathematicians of the time, the main exception being possibly
Sobolev in the 1930s. The theory of partial differential equations was in an almost
total state of chaos. That of “locally convex topological vector spaces”,
a natural extension of the work of Stefan Banach and of the Polish school
between the two Wars, was not invented until during the War by George
W. Mackey in the USA and, independently, by Dieudonné and Schwartz a
little later⁷⁸, then powerfully developed by Alexandre Grothendieck; so it
was constructed at the same time as the distributions were, and under their
influence. Afterwards Schwartz’ theory spread everywhere, including in the
USSR where I. M. Gelfand and his school published several volumes on the
subject filled with examples, until it became a fundamental tool in the theory
of partial differential equations; see for example the formidable volumes of
Lars Hörmander, one of the principal proponents of the theory. Even more
extraordinary, the general theory of distributions itself, which brought the
first French Fields Medal to Schwartz in 1950, contained no truly “profound”
theorem (unlike its applications) and required “only” the ability to
detect analogies between a dozen disparate domains and to isolate the general
principle which unified all. The philosophers of science call this a paradigm,
a new vision which not only puts order and clarity into chaos, but also and
above all allows one to pose new problems. Examples: universal gravitation, the
analysis of Newton and Leibniz, the atomic theory in chemistry, the theories of
evolution of Darwin, of heredity of Mendel, the bacteria of Pasteur, relativity and
quantum mechanics, etc.


Example 2. Take for f the function equal to 1 for $x > 0$ and to 0 for $x < 0$.
Then


$T_f(\varphi) = \int_0^{+\infty} \varphi(x)\,dx,$


whence


⁷⁸ In 1943-1945 one did not yet have the Internet and, in 1946, while my thesis was
almost complete, I discovered it, and in Russian, in a Soviet article of 1943 which
had just arrived in Paris after an inexplicable delay. Being naturally curious, and
its authors being not entirely unknown to me, thanks to some work from before
1940, I had the good idea of reading it (i.e., then, of having it translated).

$T_f'(\varphi) = -T_f(\varphi') = -\int_0^{+\infty} \varphi'(x)\,dx = \varphi(0)$


since the primitive $\varphi$ of $\varphi'$ is zero for x large. In other words, the derivative
of the distribution associated to f is the Dirac measure at the origin, and not
a function in the usual sense. There is an obvious extension to the distribution
defined by a step function: its derivative is a linear combination of Dirac measures
at the points where the function is discontinuous. For an arbitrary regulated
function f, the uniform limit on every compact of a sequence of step functions
$f_n$, one has


$T_f'(\varphi) = -T_f(\varphi') = -\int f(x)\varphi'(x)\,dx = -\lim \int f_n(x)\varphi'(x)\,dx$
since in fact one integrates over a compact interval, whence
$T_f'(\varphi) = \lim T_{f_n}'(\varphi),$
a result practically impossible to make explicit in the general case.
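
The first computation of Example 2 (the derivative of the unit step is the Dirac measure at the origin) lends itself to a numerical check. A minimal sketch, assuming the standard bump function supported in [−1, 1] as test function and a midpoint-rule quadrature; both are our choices, not the text's.

```python
import math

def phi(x):
    """Standard bump function in D(R): C-infinity, zero for |x| >= 1."""
    s = 1.0 - x * x
    return math.exp(-1.0 / s) if s > 0 else 0.0

def dphi(x):
    """Derivative of phi, obtained by the chain rule."""
    s = 1.0 - x * x
    p = phi(x)
    return p * (-2.0 * x / (s * s)) if p > 0.0 else 0.0

def integrate(g, a, b, n=20000):
    """Midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def step_derivative(test_derivative, upper=1.0):
    """T_f'(phi) = -T_f(phi') = -(integral of phi' over [0, upper]),
    where f is the unit step and phi vanishes beyond `upper`."""
    return -integrate(test_derivative, 0.0, upper)

print(step_derivative(dphi), phi(0.0))
```

Both printed values are close to exp(−1), the value of the bump at the origin.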


Example 3. Consider the distribution


$T(\varphi) = \int_0^{+\infty} \varphi(x) \log x\,dx;$


the integral converges absolutely on a neighbourhood of 0 (no problem at infinity)
since $|\varphi(x) \log x| \le \|\varphi\|\,|\log x|$. To calculate T' one integrates naively
by parts (if necessary passing to the limit on the interval $[u, +\infty[$ as $u \to 0+$):


$-T'(\varphi) = \int_0^{+\infty} \varphi'(x) \log x\,dx = \varphi(x) \log x \Big|_0^{+\infty} - \int_0^{+\infty} \varphi(x)\,x^{-1}\,dx$

and one obtains infinite expressions. This shows that it would be better
to use a primitive of $\varphi'$ which vanishes at the origin in order to neutralise
the logarithm; simplest would be to choose $\varphi(x) - \varphi(0)$, which is O(x), but
then the difficulty reemerges at infinity because of the term $\varphi(0) \log x$ in the
integrated-out part and of the term $\varphi(0)/x$ in the last integral⁷⁹. To cut the
Gordian knot one divides the integral in two; first


$\int_0^1 \varphi'(x) \log x\,dx = [\varphi(x) - \varphi(0)] \log x \Big|_0^1 - \int_0^1 [\varphi(x) - \varphi(0)]\,dx/x = -\int_0^1 [\varphi(x) - \varphi(0)]\,dx/x$


since the integrated-out part is zero by $\varphi(x) - \varphi(0) = O(x)$ and $\log 1 = 0$.
The integral obtained converges since $[\varphi(x) - \varphi(0)]/x$ is bounded on a neighbourhood
of 0 (and even tends to a limit as $x \to 0$). On the other hand,
without any problem of convergence,


⁷⁹ It is very rare to see Dieudonné make a mistake, but he does so in his Éléments
d’analyse, Vol. 3, p. 247, à propos of the same example.

$\int_1^{+\infty} \varphi'(x) \log x\,dx = \varphi(x) \log x \Big|_1^{+\infty} - \int_1^{+\infty} \varphi(x)\,dx/x = -\int_1^{+\infty} \varphi(x)\,dx/x$


since $\varphi(x)$ is zero for x large. Finally,


$T'(\varphi) = \int_0^1 [\varphi(x) - \varphi(0)]\,dx/x + \int_1^{+\infty} \varphi(x)\,dx/x,$


which shows that the derivative in the sense of distributions of the function
$\log x$ is not what one might have believed. We recommend the reader to redo
the calculation using an arbitrary point a > 0 as intermediate, and to verify
that one finds the same result.
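
The final formula of Example 3 can be compared numerically with the definition $T'(\varphi) = -T(\varphi')$. The sketch below rests on assumptions of ours: a bump function supported in [−2, 2] as test function, a midpoint-rule quadrature, and, for an intermediate point a other than 1, the extra term $\varphi(0) \log a$ which the same integration by parts appears to produce (it vanishes for a = 1).

```python
import math

def phi(x):
    """Bump function in D(R) supported in [-2, 2]."""
    s = 1.0 - x * x / 4.0
    return math.exp(-1.0 / s) if s > 0 else 0.0

def dphi(x):
    """Derivative of phi, obtained by the chain rule."""
    s = 1.0 - x * x / 4.0
    p = phi(x)
    return p * (-0.5 * x / (s * s)) if p > 0.0 else 0.0

def integrate(g, a, b, n=100000):
    """Midpoint rule on [a, b]; the midpoints never touch the endpoint 0."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def by_definition():
    """T'(phi) = -T(phi') = -(integral over [0, 2] of phi'(x) log x)."""
    return -integrate(lambda x: dphi(x) * math.log(x), 0.0, 2.0)

def by_formula(a=1.0):
    """The closed formula, with the integral split at the point a (0 < a < 2)."""
    p0 = phi(0.0)
    near = integrate(lambda x: (phi(x) - p0) / x, 0.0, a)
    far = integrate(lambda x: phi(x) / x, a, 2.0)
    return p0 * math.log(a) + near + far

print(by_definition(), by_formula(1.0), by_formula(0.5))
```

The three values coincide up to quadrature error, whichever intermediate point is used.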


Exercise. For every $\varphi \in D$ put

$T(\varphi) = \lim_{\varepsilon \to 0} \int_{|x| > \varepsilon} \varphi(x)\,dx/x.$


Show that the limit exists and that T is a distribution. Calculate its derivative,
and a “primitive”.


Introduction to the Lebesgue Theory


In Volume IV, Chapter XI we shall expound in detail the theory invented 
in about 1900 by Henri Lebesgue, which all mathematicians have used for 
a long time. One can, nevertheless, give a good idea of it in fifteen or so 
pages since, apart from Dini’s Theorem, the arguments used are technically 
very simple: the usual operations of set theory (including the generalities of 
Chap. I on countable sets), very simple inequalities (the triangle inequality, 
the generalities of Chap. II, n° 17 on infinite limits), the definition and prop- 
erties of upper bounds and of absolutely convergent series. The difficulty is 
not in proving the theorems; it is to present them in a logically coherent 
order, as in any theory when one wants to reach difficult theorems starting 
from almost nothing. We shall adopt the method perfected about fifty years 
ago by N. Bourbaki. 

In what follows we shall write X for the set on which we intend to develop
the integration theory; X is then a locally compact subset of C, for example
an interval of any kind in R or, in the general case, the intersection of an open
and a closed set in C. We shall write L(X) for the set of complex continuous
functions defined on X and zero outside a compact subset of X; a positive
measure on X is thus, by definition, a linear map


$\mu : L(X) \to \mathbb{C}$


such that $\mu(f) \ge 0$ for every $f \ge 0$. The most important case at our level
is naturally that of the usual Lebesgue measure on an interval of R, but to
restrict oneself to this does not simplify anything in the proofs or their statements.
Recall that for every function f with complex values the symbol |f|
denotes the function $x \mapsto |f(x)|$.


(i) Integration of lsc functions. As we saw in n° 31, we can immediately
define the upper integral $\mu^*(\varphi)$ of a real lsc function $\varphi$ on X so long as we
assume $\varphi$ positive outside a compact $K \subset X$; we shall write $\mathcal{I}(X) = \mathcal{I}$ for
the set of these functions. The upper integral


$\mu^*(\varphi) = \sup_{f \le \varphi,\ f \in L(X)} \mu(f)$

is always defined for $\varphi \in \mathcal{I}$, and we have $-\infty < \mu^*(\varphi) \le +\infty$. The whole
Lebesgue theory can be constructed with the help of these functions and of
their upper integrals. Their properties are strictly the same as in n° 11:


(L 1) We have $\mu^*(\varphi + \psi) = \mu^*(\varphi) + \mu^*(\psi)$ for any $\varphi, \psi \in \mathcal{I}$.
(L 2) If $\Phi \subset \mathcal{I}$ is an increasing philtre, then


(1) $\mu^*(\sup_{\varphi \in \Phi} \varphi) = \sup_{\varphi \in \Phi} \mu^*(\varphi)$


and in particular $\mu^*(\sup \varphi_n) = \sup \mu^*(\varphi_n)$ for every increasing sequence.


(L 3) We have $\mu^*(\sum \varphi_n) = \sum \mu^*(\varphi_n)$ for any positive $\varphi_n \in \mathcal{I}$.
(L 1) and (L 2) are proved as in n° 11, using Dini’s theorem (n° 30).


(ii) Measure of an open set. For every open set U in X one puts
$\mu(U) = \mu^*(\chi_U)$


where the function $\chi_U$, equal to 1 on U and to 0 on X − U, is lsc since U
is open in X. The statements (i’), (ii’) and (iii’) of n° 11 are still valid here
because they are mere translations of (L 1), (L 2) and (L 3). Note that if X
is not compact then $\mu(U)$ can take the value $+\infty$.


(iii) Upper integral of a positive function. Now consider a function f on
X, with values in $[0, +\infty]$. There exist functions $\varphi \in \mathcal{I}$ such that $\varphi \ge f$,
for example the function everywhere equal to $+\infty$. So we define the upper
integral of f by putting


(2) $\mu^*(f) = \inf_{\varphi \ge f} \mu^*(\varphi) \le +\infty.$

Despite the notation, this definition is not identical to that of n° 1: for the
Dirichlet function one has $\mu^*(f) = 1$ in the Riemann theory, but $\mu^*(f) = 0$ in
the Lebesgue theory as we shall see below. If f is lsc (in particular, if f is continuous),
the definition (2) provides the same value as (1) since f appears among those
$\varphi \in \mathcal{I}$ which majorise f. Another trivial point, always useful, is that


$f \le g \Longrightarrow \mu^*(f) \le \mu^*(g).$


For every set $A \subset X$ one similarly puts


(3) $\mu^*(A) = \mu^*(\chi_A),$


the upper integral of the characteristic function of A; as in the case of open 
sets, the properties of the measures of sets will be obtained immediately, at 
the end of this Appendix, by applying to their characteristic functions the 
statements valid for arbitrary functions. For the moment let us just note that

$A \subset B \Longrightarrow \mu^*(A) \le \mu^*(B).$


For a function f with complex (or even vector) values one puts⁸⁰


(4) $N_1(f) = \mu^*(|f|) \le +\infty.$


We always have


(5) $N_1(f + g) \le N_1(f) + N_1(g);$


this is clear if one of the terms on the right hand side is infinite; in the
opposite case, and by definition, for any r > 0 there exist lsc functions $\varphi$ and
$\psi$ such that


$|f| \le \varphi$, $\mu^*(\varphi) \le \mu^*(|f|) + r$, $|g| \le \psi$, $\mu^*(\psi) \le \mu^*(|g|) + r;$


then $|f + g| \le \varphi + \psi$, whence, from (L 1),


$N_1(f + g) \le \mu^*(\varphi + \psi) = \mu^*(\varphi) + \mu^*(\psi) \le \mu^*(|f|) + \mu^*(|g|) + 2r,$


qed.
We also have


(6) $N_1(\lambda f) = |\lambda| N_1(f)$


for every scalar $\lambda$ so long as we agree, as everywhere in this context, that
$0 \cdot (+\infty) = 0$


since $N_1(0) = 0$. This convention must particularly be respected when calculating
a product fg of two functions: if for example we have f(x) = 0 and
$g(x) = +\infty$ for an $x \in X$ we must agree that fg is zero at the point x.

If one denotes by $\mathcal{F}^1(X; \mu) = \mathcal{F}^1$ the set of complex functions such that
$N_1(f) < +\infty$ one obtains a vector space on which the function $N_1$ is a norm,
up to one “detail”: the relation $N_1(f) = 0$ does not imply f = 0.

The first important statement is the following:


(L 4) If $f(x) = \sum f_n(x) \le +\infty$ is the sum of a series of positive functions,
then


(7) $\mu^*(f) \le \sum \mu^*(f_n).$


There is nothing to prove if the right hand side is infinite. If it is finite there
exists, for r > 0 given, and for every n, a function $\varphi_n \in \mathcal{I}$ satisfying $f_n \le \varphi_n$,
$\mu^*(\varphi_n) \le \mu^*(f_n) + r/2^n$: this is definition (2); the function $\varphi = \sum \varphi_n$ is lsc,
it majorises f, and, from (L 3),

⁸⁰ $N_1(f)$ is often written $\|f\|_1$ in spite of the fact that $N_1(f) = 0$ does not imply
f = 0. We shall also use the notation $N_1(f) = \mu^*(f)$ for functions with values
in $[0, +\infty]$.

$\mu^*(f) \le \sum \mu^*(\varphi_n) \le \sum (\mu^*(f_n) + r/2^n) = r + \sum \mu^*(f_n),$


qed. Here we note the appearance of a “trick” apparently unknown before
Borel and Lebesgue, despite its simplicity: to use the formula $r = \sum r/2^n$ to
estimate the sum of a series in a controlled way. The only importance of the
numbers $1/2^n$ is that their sum is 1. (7) implies


(7’) $\mu^*(\bigcup A_n) \le \sum \mu^*(A_n)$


for any sets $A_n \subset X$, because the characteristic function $\chi$ of the union of
the $A_n$ is majorised by the sum of the characteristic functions $\chi_n$ of the $A_n$.
Do not believe that you will obtain an equality if you assume that the $A_n$
are pairwise disjoint: for this you have also to assume that the $A_n$ are “measurable”,
as we shall see.


(iv) Sets of measure zero. The relation (7’) suggests the notion of a set
of measure zero or negligible, i.e., such that


(8) $\mu^*(N) = \mu^*(\chi_N) = 0.$


This is equivalent to requiring that, for every r > 0, there exists an open set
U in X such that


(8’) $N \subset U$ and $\mu(U) < r$.


First of all, (8’) implies (8) since $\mu^*(N) \le \mu^*(U)$. If, conversely, N satisfies
(8), there exists for every r > 0 a $\varphi \in \mathcal{I}$ such that $\varphi \ge \chi_N$ and $\mu^*(\varphi) < r$;
the relation $\varphi(x) > 1/2$ defines an open $U \supset N$ for which $\chi_U \le 2\varphi$, whence


$\mu(U) = \mu^*(\chi_U) \le 2\mu^*(\varphi) < 2r$
and (8’).


(L 5) Every subset of a negligible set is negligible; the union of a finite or
countable family of negligible sets is negligible.

Use (3) and (7’).

For the usual Lebesgue measure (8’) shows that a singleton set is negligible.
Then so is every countable set, for example the set of $x \in X$ with
rational coordinates, even though this set is everywhere dense in X. But the
converse is false. The most famous counterexample (see the end of (viii))
is Cantor’s triadic set in X = [0, 1] consisting of the $x \in X$ which can be
written in base 3 enumeration without using the digit 1. For all that, one
cannot deduce from this that every union of negligible sets is negligible: if
such were the case, every set would be negligible, being the union of singleton
sets! Countability is essential in (L 5).
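
The combination of (8’) with the $r/2^n$ trick can be illustrated on the rationals of X = [0, 1] for the usual Lebesgue measure. The sketch below is only an illustration with names of our own choosing (`rationals_in_unit_interval`, `cover`): it covers the first points of an enumeration by open intervals, the n-th of length $r/2^n$, so that the total length stays below r however many points are covered. It uses exact rational arithmetic; the intervals may stick out of [0, 1] and should be intersected with X to obtain open sets of X.

```python
import math
from fractions import Fraction

def rationals_in_unit_interval():
    """Enumerate the rationals of [0, 1] without repetition: 0, 1, 1/2, 1/3, 2/3, ..."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            if math.gcd(p, q) == 1:
                yield Fraction(p, q)
        q += 1

def cover(points, r, count):
    """Cover the first `count` points by open intervals; the n-th interval
    (n = 1, 2, ...) is centred at the n-th point and has length r / 2**n."""
    intervals = []
    for n, x in zip(range(1, count + 1), points):
        half = Fraction(r) / 2 ** (n + 1)
        intervals.append((x - half, x + half))
    return intervals

r = Fraction(1, 100)
intervals = cover(rationals_in_unit_interval(), r, 50)
total = sum(b - a for a, b in intervals)
print(float(total), "<", float(r))
```

By (7’) the measure of the union of the intervals is at most the sum of their lengths, which is r(1 − 2^(−count)) < r.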

When one has a relation P{x} depending on a variable $x \in X$, for example
$f(x) \ge g(x)$, one says that P{x} is true almost everywhere (a.e.) if the set
of x such that P{x} is not true is of measure zero. If one has a finite or
countable family of statements $P_n\{x\}$ and if each of them, taken separately,
is true almost everywhere, i.e., outside a negligible set $N_n$, then they are
simultaneously true outside $N = \bigcup N_n$, so almost everywhere, from (L 5).
For example, the sum of a series of functions that are zero almost everywhere
is zero almost everywhere, similarly the product of two almost everywhere
zero functions, their upper envelope, etc.

The set of complex-valued functions that are almost everywhere zero,
or, as one says, negligible, is thus a vector subspace $\mathcal{N}$ of $\mathcal{F}^1$. One might
prefer never to meet these functions, known to exist only in the sense of
mathematical logic, but it is not generally the theory of integration which
allows one to eliminate, even less to exhibit them. They serve essentially
to camouflage the horrors “of no importance” since they do not count in
calculating integrals: even in the simple Riemann theory one knows that if
two regulated functions are equal outside a countable set their integrals are
equal (n° 7, Theorem 7).


(L 6) Let f be a function with complex values; then 

(9) $N_1(f) = 0 \iff f(x) = 0$ a.e. 

If f is a function with values in $[0, +\infty]$ then 

(10) $N_1(f) < +\infty \implies f(x) < +\infty$ a.e. 


Assuming $\mu^*(|f|) = 0$, for every integer $p \ge 1$ consider the set 
$N_p = \{|f(x)| > 1/p\}$ and write $\chi_p$ for its characteristic function. We have 
$|f| \ge \chi_p/p$, whence $\mu^*(N_p) = \mu^*(\chi_p) \le p\,\mu^*(|f|) = 0$. Being the union 
of the $N_p$, the set $N = \{f(x) \ne 0\}$ is of measure zero, by (L 5). Assume 
conversely that N is of measure zero, and now put 

$N_p = \{p < |f(x)| \le p + 1\} \subset N$

for every $p \ge 0$, and let $\chi_p$ be the characteristic function of $N_p$. It is clear that 
$|f|\chi_p \le (p+1)\chi_p$, whence $\mu^*(|f|\chi_p) \le (p+1)\mu^*(\chi_p) = 0$ since $N_p$, contained 
in N, is of measure zero, by (L 5). Since $|f| = \sum |f|\chi_p$ we have $\mu^*(|f|) = 0$ 
by (L 4), whence (9). 

To prove (10), put $A_p = \{f(x) > p\}$ for every $p \ge 1$ and again let $\chi_p$ be the 
characteristic function of $A_p$. We have $f \ge p\chi_p$, whence $\mu^*(\chi_p) \le \mu^*(f)/p$. 
Since the set $N = \{f(x) = +\infty\}$ is the intersection of the $A_p$, we obtain 
$\mu^*(N) \le \mu^*(f)/p$ for every p, whence $\mu^*(N) = 0$ if $\mu^*(f) < +\infty$, qed. 

(L 6) shows that if two functions f and g are equal almost everywhere 
then $N_1(f) \le N_1(g) + N_1(f - g) = N_1(g)$, by (9); by symmetry, one sees that 

(11) $f = g$ a.e. $\implies N_1(f) = N_1(g).$


The number $N_1(f)$ thus depends only on the equivalence class of f modulo 
the vector subspace $\mathcal{N}$ of negligible functions; on writing f for this class, i.e. 
the set of all the functions almost everywhere equal to f (Chap. I, end of 
n° 4), we put (but see footnote®?) 


(12) $\|f\|_1 = N_1(f).$

The relations (5) and (6) are valid for these classes, and as we have done 
what was strictly necessary for $\|f\|_1 = 0$ to imply $f(x) = 0$ a.e., i.e. f = 0, 
we see that the quotient space 


PhS BUEN 
is a true normed vector space (Appendix to Chap. III, n° 5). 

(v) Integrable functions. This done, a function f with complex values will 
be said to be integrable on X for the measure considered if, for every r > 0, 
there exists a continuous function $g \in L(X)$ such that 

(13) $N_1(f - g) = \mu^*(|f - g|) < r$


or, equivalently, if there exists a sequence of continuous functions $f_n \in L(X)$ 
such that 

(13’) $\lim N_1(f - f_n) = 0;$

if so one defines the integral of f by 

(14) $\mu(f) = \int f(x)\,d\mu(x) = \lim \mu(f_n).$


As in n° 2 of § 1, this limit exists and does not depend on the sequence $(f_n)$ 
chosen. For 

$|\mu(f_p) - \mu(f_q)| = |\mu(f_p - f_q)| \le \mu(|f_p - f_q|) = N_1(f_p - f_q) \le N_1(f_p - f) + N_1(f - f_q);$

the sequence $\mu(f_n)$ thus satisfies Cauchy’s criterion. If another sequence $(g_n)$ 
of continuous functions satisfies (13’), then similarly 

$|\mu(f_n) - \mu(g_n)| \le \mu(|f_n - g_n|) = N_1(f_n - g_n) \le N_1(f_n - f) + N_1(f - g_n),$

whence $\lim\,[\mu(f_n) - \mu(g_n)] = 0$, qed. 

Integrable functions have properties which are almost trivial, and others 
which are less so. Let us start with the first. 

It is immediately obvious that every function $f \in L(X)$ is integrable [take 
g = f in (13)] and that its Lebesgue integral is equal to the number $\mu(f)$, 
the value at f of the linear form $\mu : L(X) \to \mathbb{C}$. 

If f and g are equal almost everywhere and if f is integrable, then g is 
integrable and $\mu(f) = \mu(g)$ because, in (13’), $N_1(f - f_n)$ is unchanged if we 
replace f by g. This allows us to say that a function f defined outside a 
negligible set $N \subset X$ is integrable if any function g everywhere defined and 
such that f = g outside N is so; one then defines $\mu(f) = \mu(g)$. 

(13’) shows further that $N_1(f) = \lim N_1(f_n) = \lim \mu(|f_n|)$. Now we know 
that $|\mu(f_n)| \le \mu(|f_n|)$ since we are dealing with continuous functions (same 
proof as in n° 2, Theorem 1). Thus 

$|\mu(f)| = \lim |\mu(f_n)| \le \lim \mu(|f_n|) = \lim N_1(f_n),$

whence 

(15) $|\mu(f)| \le N_1(f)$

for every integrable function f. 


(L 7) If f is integrable then so is |f| and 

(16) $N_1(f) = \mu(|f|) = \int |f(x)|\,d\mu(x).$

To prove this we go back again to (13’) and (14). Using the fact that 
$\big||u| - |v|\big| \le |u - v|$ for any $u, v \in \mathbb{C}$ we obtain the inequality 
$\big||f| - |f_n|\big| \le |f - f_n|$; this shows that $\lim N_1(|f| - |f_n|) = 0$; since the 
functions $|f_n|$ are in L(X), |f| is integrable and we have $\mu(|f|) = \lim \mu(|f_n|)$ by 
definition of the integral. Now $N_1(f_n)$ converges to $N_1(f)$. Since (16) holds, by 
the definition of $N_1$, for every continuous function in L(X), it applies to the $f_n$, so 

$N_1(f) = \lim N_1(f_n) = \lim \mu(|f_n|) = \mu(|f|),$

qed. 

(L 8) If f and g are integrable then so are $\alpha f + \beta g$ for any $\alpha, \beta \in \mathbb{C}$ and 

(17) $\mu(\alpha f + \beta g) = \alpha\mu(f) + \beta\mu(g).$

Let $(f_n)$ and $(g_n)$ be sequences in L(X) such that $\lim N_1(f - f_n) = 
\lim N_1(g - g_n) = 0$. The triangle inequality 

$N_1[(\alpha f + \beta g) - (\alpha f_n + \beta g_n)] \le |\alpha| N_1(f - f_n) + |\beta| N_1(g - g_n)$

proves the integrability of $\alpha f + \beta g$. Moreover, by the definition of the integral, 

$\mu(\alpha f + \beta g) = \lim \mu(\alpha f_n + \beta g_n) = \lim\,[\alpha\mu(f_n) + \beta\mu(g_n)],$

whence (17). 


(L 9) Let $(f_n)$ be a sequence of integrable functions and f a function such 
that $\lim N_1(f - f_n) = 0$. Then f is integrable and 

(18) $\int f(x)\,d\mu(x) = \lim \int f_n(x)\,d\mu(x).$

For every r > 0 we have $N_1(f - f_n) < r$ for all sufficiently large n; on 
the other hand, for every n there exists a function $g_n \in L(X)$ such that 
$N_1(f_n - g_n) < 1/n$; for n large we thus have $N_1(f - g_n) < r + 1/n < 2r$, 
whence the integrability of f. Moreover, from (L 8) and (15), 

$|\mu(f) - \mu(f_n)| = |\mu(f - f_n)| \le N_1(f - f_n),$

whence (18). 


(L 10) If f and g are integrable and real-valued then inf(f,g) and sup(f, g) 
are integrable. 

Since we know that f − g and |f − g| are integrable, it is enough, as in 
n° 2, Theorem 2, to show that $f^+$ is integrable. To do this one uses (13’) and 
the inequality $|f^+ - g^+| \le |f - g|$, valid for any real f and g, qed. 

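(L 11) An lsc function $\varphi$ of the class used to define $\mu^*$ is integrable if and 
only if $\mu^*(\varphi) < +\infty$. If so, 

$\mu(\varphi) = \mu^*(\varphi).$

Whether $\varphi$ is or is not lsc, the condition $\mu^*(\varphi) < +\infty$ is necessary. If it is 
satisfied there exists for every r > 0 a continuous function $f \le \varphi$ such that 

$\mu(f) \le \mu^*(\varphi) \le \mu(f) + r;$

since $\varphi = f + (\varphi - f) = f + |\varphi - f|$ and since f and $\varphi - f$ are lsc, f being 
continuous, we have $\mu^*(\varphi) = \mu(f) + \mu^*(|\varphi - f|)$ and so 

$N_1(\varphi - f) = \mu^*(|\varphi - f|) = \mu^*(\varphi) - \mu(f) \le r$

from (L 1); whence the integrability of $\varphi$. The preceding relation also shows 
that 

$|\mu(\varphi) - \mu(f)| \le N_1(\varphi - f) = |\mu^*(\varphi) - \mu(f)| \le r,$

whence $\mu(\varphi) = \mu^*(\varphi)$, qed. 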


(vi) Convergence in mean: Cauchy’s criterion. We say that the functions 
$f_n$ converge in mean to a function f when $\lim N_1(f - f_n) = 0$, i.e. 

$\lim \int |f(x) - f_n(x)|\,d\mu(x) = 0$

from (L 7); sometimes one writes the preceding relation in the simpler form 

$f(x) = \mathrm{l.i.m.}\ f_n(x),$

the limit in mean. 

We shall show that Cauchy’s criterion is valid for convergence in mean; 
this is not so in the Riemann theory and this is one of the fundamental 
breakthroughs accomplished by Lebesgue (or rather by his immediate successors). 
First we prove the following result: 


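(L 12) Let $(f_n)$ be a series of integrable functions such that 

$\sum \int |f_n(x)|\,d\mu(x) = \sum N_1(f_n) < +\infty.$

Then $\sum |f_n(x)| < +\infty$ a.e., every function f such that $f(x) = \sum f_n(x)$ a.e. 
is integrable, and 

$\lim_n N_1\Big(f - \sum_{p \le n} f_p\Big) = 0,$

$\int f(x)\,d\mu(x) = \sum \int f_n(x)\,d\mu(x),$

$\sum \Big|\int f_n(x)\,d\mu(x)\Big| < +\infty.$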


We put $F(x) = \sum |f_n(x)| \le +\infty$; from (L 4) we have $N_1(F) \le 
\sum N_1(f_n) < +\infty$, so $F(x) < +\infty$ a.e. from (L 6), so that the series $\sum f_n(x)$ 
converges absolutely almost everywhere. Let f be a function almost everywhere 
equal to the sum of this series. Since $|f(x)| \le F(x)$ a.e. we have 

$N_1(f) = \mu^*(|f|) \le \mu^*(F) = N_1(F) \le \sum N_1(f_n).$

If we suppress the first p terms of the given series we replace f by a function 
equal almost everywhere to $f - (f_1 + \ldots + f_p)$, whence, by the same argument, 

(19) $N_1(f - f_1 - \ldots - f_p) \le \sum_{q > p} N_1(f_q),$

which is arbitrarily small for p large. Since $f_1 + \ldots + f_p$ is integrable, so is f, 
by (L 9), and 

(20) $\mu(f) = \lim \mu(f_1 + \ldots + f_p) = \sum \mu(f_p).$

The series is absolutely convergent since $|\mu(f_p)| \le N_1(f_p)$, qed. 

One should notice the difference between (L 12) and the elementary 
theorems on term-by-term integration of normally convergent series (n° 4): 
convergence (almost everywhere!) of the series $\sum |f_n(x)|$ is a consequence, and 
no longer a cause, of the relation $\sum \mu(|f_n|) < +\infty$. 

We can now establish Cauchy’s criterion for the convergence in mean. 


(L 13) (Riesz-Fischer Theorem⁸¹) If a sequence $(f_n)$ of integrable functions 
satisfies Cauchy’s criterion for convergence in mean then it converges 
in mean to an integrable function f and one can extract a subsequence 
dominated by an integrable function and converging almost everywhere to f. 

⁸¹ This result shows that the normed vector space of classes of integrable 
functions is complete, i.e. is a Banach space (Appendix to Chap. III, n° 5). 
Historically, a large part of the theory of Banach spaces has been motivated by 
integration theory. 

Suppose that for every r > 0 

$N_1(f_s - f_t) < r$ for s and t large. 


We are to prove the existence of an integrable function f such that 
$\lim N_1(f - f_n) = 0$. As in every metric space it is enough to show that from the 
given Cauchy sequence we can extract a subsequence which converges in mean: see 
the first proof of the usual Cauchy criterion in Chap. III, n° 10, Theorem 13. 
For every $s \in \mathbb{N}$ denote by $n_s$ the least integer such that 

$k \ge n_s \ \&\ h \ge n_s \implies N_1(f_k - f_h) \le 1/2^s.$

It is clear that $n_s \le n_{s+1}$. So 

(21) $N_1(f_{n_{s+1}} - f_{n_s}) \le 1/2^s$ for every s. 
For the differences 

(22) $g_s = f_{n_{s+1}} - f_{n_s},$

which are integrable, by (L 8), we therefore have $\sum N_1(g_s) < +\infty$. By (L 12), 
the series $\sum g_s$ converges absolutely almost everywhere and its sum g is also 
the limit in mean of its partial sums. But (22) shows that 

(23) $g_1(x) + \ldots + g_s(x) = f_{n_{s+1}}(x) - f_{n_1}(x).$

Since the left hand side tends to g(x) a.e., we deduce that 

$\lim f_{n_s}(x) = g(x) + f_{n_1}(x)$

exists almost everywhere; if we denote by f the function on the right hand 
side (its values at the points where the limit does not exist can be chosen 
arbitrarily), we have 

(24) $g - \sum_{t \le s} g_t = f - f_{n_{s+1}}$ a.e. 

by (23), so 

$\lim N_1(f - f_{n_s}) = 0,$

by (L 12). We have thus extracted from $(f_n)$ a subsequence which converges 
to f both almost everywhere and in mean. It remains to show the existence 
of an integrable function $p \ge 0$ such that 

$|f_{n_s}(x)| \le p(x)$

for every s and every x. But (24) shows that 

$|f_{n_{s+1}}| \le |f| + |g| + \sum |g_n|$

and we know that f, g and the sum of the series $\sum |g_n|$ are integrable, qed. 

The statement (L 13) applies in particular to every sequence $(f_n)$ which 
converges in mean since such a sequence trivially satisfies Cauchy’s criterion. 
It does not follow that it converges almost everywhere: we know only that we 
can extract a subsequence converging a.e. But if, for some other reason, one 
also knows that $\lim f_n(x) = g(x)$ exists a.e. then g is necessarily the limit in 
mean of $f_n$. Indeed, the limit in mean f is actually the limit a.e. of a sequence 
$(g_n)$ extracted from the given sequence; if $\lim f_n(x) = g(x)$ outside a negligible 
set N and $\lim g_n(x) = f(x)$ outside a negligible set N’ then f(x) = g(x) 
outside $N \cup N'$. In other words: 


(L 14) If a sequence of integrable functions fn converges in mean to a 
function f and almost everywhere to a function g, then f = g almost every- 
where. 
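
A standard illustration of the gap between the two modes of convergence is the "typewriter" sequence, sketched below under the assumption of Lebesgue measure on X = [0, 1]. The n-th function is the indicator of a dyadic interval [j/2^k, (j+1)/2^k]; its integral 2^(-k) tends to 0, yet every point is hit once in each block, so the full sequence converges at no point, while the subsequence n = 2^k converges to 0 outside x = 0. The names `typewriter`, `mean_norm` and `value` are illustrative only.

```python
from fractions import Fraction

def typewriter(n):
    """Return (a, b): the n-th function (n >= 1) is the indicator of [a, b]."""
    k = n.bit_length() - 1          # block index: 2**k <= n < 2**(k + 1)
    j = n - 2 ** k                  # position inside the block
    return Fraction(j, 2 ** k), Fraction(j + 1, 2 ** k)

def mean_norm(n):
    """N_1 of the n-th function, i.e. the length of its interval."""
    a, b = typewriter(n)
    return b - a

def value(n, x):
    """Value of the n-th function at the point x."""
    a, b = typewriter(n)
    return 1 if a <= x <= b else 0

x = Fraction(1, 3)
print([value(n, x) for n in range(1, 16)])   # keeps returning to 1
print([value(2 ** k, x) for k in range(6)])  # subsequence settles at 0
```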

Since every integrable function f is, by definition, the limit in mean of 
functions belonging to L(X), the Riesz-Fischer theorem shows the existence 
of a sequence $(f_n)$ of continuous functions of compact support such that 
$f(x) = \lim f_n(x)$ almost everywhere; one can even assume the $f_n$ dominated 
by an integrable function. 

Exercise. Let f be an integrable real function. (i) Show that there are 
$f_n \in L(X)$ such that 

$\sum N_1(f_n) < +\infty, \qquad \sum f_n(x) = f(x)$ a.e. 

(ii) Put $\varphi'(x) = \sum f_n^+(x)$ and $\varphi''(x) = \sum f_n^-(x)$. Show that $\varphi'$ and $\varphi''$ are 
lsc, integrable, and that $f = \varphi' - \varphi''$ almost everywhere (compare with note 
23 of n° 11). 


(vii) Lebesgue’s grand theorem. Theorem (L 19) below, without doubt the 
most useful of the whole theory, is the definitive version of the “dominated 
convergence” theorem of which we gave a pale unproven glimpse in n° 4. 


(L 15) Let $(f_n)$ be an increasing sequence of integrable real functions. 
For $f = \sup f_n = \lim f_n$ to be integrable it is necessary and sufficient that 
$\sup \mu(f_n) < +\infty$. If so 

$\lim N_1(f - f_n) = 0, \qquad \mu(f) = \lim \mu(f_n).$

The necessity of the condition is clear since $f_n \le f$ for every n. If it is satisfied 
the relation 

$N_1(f_q - f_p) = \mu(f_q - f_p) = \mu(f_q) - \mu(f_p),$

valid for p < q by (23), shows that $(f_n)$ is a Cauchy sequence for convergence 
in mean. It therefore converges in mean, necessarily to the function $f(x) = 
\lim f_n(x)$ by (L 14), qed. 

(L 16) Let $(f_n)$ be a countable family of integrable real functions; for the 
upper envelope $f(x) = \sup f_n(x)$ to be integrable it is necessary and sufficient 
that the $f_n$ be dominated by a function with finite upper integral. 

If f is integrable, it is clear that the condition is satisfied. If 
conversely we have $f_n \le p$ where $\mu^*(p) < +\infty$, the partial upper envelopes 
$g_n = \sup(f_1, \ldots, f_n)$, which are integrable by (L 10), form an increasing 
sequence whose integrals are all $\le \mu^*(p)$; since $\sup(f_n) = \sup(g_n)$, we need 
only apply (L 15). 


(L 17) Every decreasing sequence $(f_n)$ of positive integrable functions 
converges in mean to the integrable function $f(x) = \lim f_n(x)$. 

Recall first that in R every decreasing sequence with positive terms con- 
verges (even when all the terms are equal to +00) and therefore satisfies 
Cauchy’s criterion if its limit is finite. This said, for p < q we have 


$N_1(f_p - f_q) = \mu(|f_p - f_q|) = \mu(f_p - f_q) = \mu(f_p) - \mu(f_q)$

by (L 7) and (L 8). Now the sequence $\mu(f_n) \in \mathbb{R}$ is decreasing and has positive 
terms, hence converges; so $N_1(f_p - f_q) < r$ for p and q large, and the sequence 
$(f_n)$ converges in mean by (L 13). The function $f(x) = \lim f_n(x)$ is almost 
everywhere finite since $\mu^*(f) \le \mu^*(f_n)$ for every n; the sequence therefore 
converges almost everywhere to f; by (L 14) this is the limit in mean of $f_n$, qed. 


(L 18) The lower envelope of a countable family of positive integrable 
functions is integrable. 
Apply (L 17) to the functions inf(fi,..., fn). 


(L 19) (Lebesgue’s dominated convergence theorem) Let $(f_n)$ be a sequence 
of integrable functions which converges almost everywhere to a function f. 
Assume that there exists a positive function p such that 

$\mu^*(p) < +\infty \quad \& \quad |f_n(x)| \le p(x)$ a.e. for every n. 

Then f is integrable and 

$f(x) = \mathrm{l.i.m.}\ f_n(x), \qquad \int f(x)\,d\mu(x) = \lim \int f_n(x)\,d\mu(x).$


We first show how to write the usual Cauchy criterion in a form adapted 
to the following proof. 

If $(u_n)$ is a sequence of complex numbers, this says that, for every r > 0, 
there exists an integer n such that $|u_i - u_j| < r$ for any $i, j \ge n$. By the 
definition of an upper bound this is equivalent to the relation 

$\sup_{i,j \ge n} |u_i - u_j| \le r.$

If then one puts 

$\sup_{i,j \ge n} |u_i - u_j| = v_n$

for every n, one obtains a decreasing sequence of positive numbers and 
Cauchy’s criterion is equivalent to the relation 

$\lim v_n = 0.$

This established, let us prove (L 19). By (L 14) it is enough to show that 
$(f_n)$ is a Cauchy sequence for convergence in mean. The sequence of functions 

$g_n = \sup_{i,j \ge n} |f_i - f_j| \le 2p$

is decreasing. Since $N_1(f_i - f_j) \le N_1(g_n)$ for any $i, j \ge n$, the proof reduces 
to showing that the $g_n$ converge in mean to 0. By Cauchy’s criterion in $\mathbb{R}$, 
$g_n(x)$ converges to 0 at every x where $\lim f_n(x)$ exists, so almost everywhere. 
It is then enough, by (L 14), to show that the $g_n$ converge in mean, which 
(L 17) guarantees so long as the $g_n$ are integrable. Now $g_n$ is the upper 
envelope of the countable family of the $|f_i - f_j|$ ($i, j \ge n$), and these integrable 
functions are dominated by the function 2p, whence the result by (L 16), qed. 


(viii) Integrable sets. A set $A \subset X$ is said to be integrable if its 
characteristic function is integrable; then one puts $\mu(A) = \mu(\chi_A)$. If A and 
$B \subset A$ are integrable then so is A − B, by (L 8), and 

$\mu(A - B) = \mu(A) - \mu(B).$

If A and B are integrable so are $A \cap B$ and $A \cup B$, by (L 10). If A and B are 
equal up to a negligible set, and if one of them is integrable, then so is the 
other. 


(L 20) Every open or closed set A such that $\mu^*(A) < +\infty$ is integrable. 

The first case follows from (L 11). The second follows from the first if one 
shows that there exists an open integrable $U \supset A$, since then A = U − (U − A) 
where U and U − A are open and have finite outer measure, so are integrable. 
In fact, more generally, 

$\mu^*(A) = \inf_{U \supset A,\ U\ \mathrm{open}} \mu^*(U)$

for every set A. This is clear if the left hand side is infinite. If it is finite, there 
exists for every r > 0 an lsc function $\varphi \ge \chi_A$ such that $\mu^*(A) \le \mu^*(\varphi) \le 
\mu^*(A) + r$; the open sets $U_n = \{\varphi(x) > 1 - 1/n\}$ contain A for every n and 
$\mu^*(\varphi) \ge (1 - 1/n)\mu^*(U_n)$, whence 

$(1 - 1/n)\mu^*(U_n) \le \mu^*(A) + r$

and therefore $\mu^*(U_n) \le \mu^*(A) + 2r$ for n large, qed. 

In particular, the set X itself is integrable if and only if $\mu^*(X) < +\infty$; 
since the function 1 is the upper envelope of the $f \in L(X)$ such that $0 \le f \le 1$, 
and since $|f| \le \|f\|_X \cdot 1$ for every $f \in L(X)$, we then have 

$|\mu(f)| \le \mu(1)\,\|f\|_X$

for every $f \in L(X)$; the measure $\mu$ is therefore bounded or of finite total 
mass. If, conversely, we have a bound $|\mu(f)| \le M\|f\|_X$ then it is clear that 
$\mu(1) \le M$. 

We also see (in the general case) that every compact set is integrable. 


(L 21) The intersection $A = \bigcap A_n$ of every countable family of integrable 
sets is integrable. 

The characteristic function $\chi$ of A is of course the lower envelope of the 
functions $\chi_n$ of the $A_n$, whence the result by (L 18). If the sequence $(A_n)$ is 
decreasing, then, by (L 17), 

$\mu\big(\bigcap A_n\big) = \lim \mu(A_n) = \inf \mu(A_n).$


(L 22) For the union $A = \bigcup A_n$ of a countable family of integrable sets to 
be integrable it is necessary and sufficient that there exists a set B such that 

(25) $\mu^*(B) < +\infty$ and $A_n \subset B$ for every n. 

Necessity is clear: choose $B = \bigcup A_n$. If the condition is satisfied the 
characteristic functions of the $A_n$ are dominated by the function $\chi_B$, whence the 
result by (L 16). 

Further, 

$\mu\big(\bigcup A_n\big) \le \sum \mu(A_n)$

by (3’), with equality if the $A_n$ are pairwise disjoint, by (L 12). 
When the sequence $A_n$ is increasing (L 15) allows one to replace the 
condition (25) by $\sup \mu(A_n) < +\infty$; then 

$\mu(A) = \lim \mu(A_n) = \sup \mu(A_n).$


Let us show for example that the Cantor set C is of measure zero (for 
Lebesgue measure on $\mathbb{R}$). This set is constructed by removing from [0,1] 
its middle interval ]1/3, 2/3[, then from each of the two remaining intervals 
their middle interval, then from each of the four remaining intervals their 
middle interval, and so on indefinitely. The total sum of the lengths of 
the excluded intervals is equal to 

$1/3 + 2/3^2 + 2^2/3^3 + \ldots = 1,$

and since they are pairwise disjoint, we have m([0,1] − C) = 1, whence 
m(C) = 0. 
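
The geometric series above can be checked with exact rational arithmetic. The following minimal sketch assumes the construction just described (2^(k−1) open intervals of length 3^(−k) removed at stage k); the function names are illustrative.

```python
from fractions import Fraction

def removed_length(steps):
    """Total length of the open middle thirds removed in the first `steps` stages."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, steps + 1))

def remaining_length(steps):
    """Length of the union of the 2**steps closed intervals left after `steps` stages."""
    return Fraction(2, 3) ** steps

for n in (1, 2, 5, 20):
    print(n, removed_length(n), float(remaining_length(n)))
```

At every stage the two quantities add up to 1, and the remaining length (2/3)^n tends to 0, which is the content of m(C) = 0.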


(ix) Measurable sets. We say that a set $A \subset X$ is measurable if $A \cap K$ 
is integrable for every compact set $K \subset X$. It is clear that every integrable 
set is measurable, as well as X, or the complement of any measurable set, or 
the union and intersection of a countable family of measurable sets. 

Every open or closed set M is measurable; if M is closed, $M \cap K$ is compact 
and so integrable for every compact $K \subset X$, so that M is measurable. If M 
is open then X − M is closed, so measurable, hence M is too. 

These results allow us, starting from open sets, closed sets and from sets 
of measure zero, to construct extraordinarily complicated measurable sets: 
countable intersections of countable unions of countable intersections of open 
sets, for example. In fact, the difficulty is rather to construct nonmeasurable 
sets explicitly, a practically impossible task without using transfinite induc- 
tion as in Chap. I in some form or another. In current practice one has no 
chance of meeting nonmeasurable sets; even so this is no excuse for evading 
proofs of measurability ... 


(L 23) For a measurable set A to be integrable it is necessary and sufficient 
that $\mu^*(A) < +\infty$. 

Assume we have proved that X is the union of a countable family of 
compact sets $K_n$; since the $A \cap K_n$ are by hypothesis integrable, the set 

$A = \bigcup A \cap K_n$

is then integrable by (L 22). It remains to prove the existence of the $K_n$. It is 
obvious if X is an interval in $\mathbb{R}$. If X is an open subset of $\mathbb{C}$ let D be the 
countable set of $x \in X$ with rational coordinates. For every $x \in X$ there 
exists an n such that the closed disc B(x, 1/n) is contained in X, then a 
$d \in D$ such that |x − d| < 1/2n; the closed disc B(d, 1/2n) is then contained 
in $B(x, 1/n) \subset X$ and contains x. This shows that X is the union of a 
countable family of compact discs of the form B(d, 1/p), whence the result. 
If finally $X = U \cap F$ with U open and F closed, and if $U = \bigcup K_n$, then 
$X = \bigcup K_n \cap F$, qed. 

The preceding result shows that the notion of measurable set does not 
differ from that of an integrable set except when $\mu(X) = +\infty$. 

To conclude, let us give a characterisation of integrable real functions 
which will lead us back to Lebesgue’s original point of view: 


(L 24) Let f be a real function such that $N_1(f) < +\infty$. For f to 
be integrable it is necessary and sufficient that, for any a and b, the set 
$\{a \le f(x) \le b\}$ should be measurable. 

To establish the necessity of the condition one chooses a sequence of 
functions $f_n \in L(X)$ and a negligible set N such that $f(x) = \lim f_n(x)$ for 
every $x \in X - N$. Since 

$[a, b] = \bigcap_p\ ]a - 1/p,\ b + 1/p[,$

the relation $a \le f(x) \le b$ means that for every p one has 

(26) $a - 1/p < f_n(x) < b + 1/p$ for every sufficiently large n. 

For n and p given (26) defines an open set $U_{p,n}$, and the $x \in X - N$ such that 
(26) is satisfied for every $n \ge q$ are the points of the measurable set 

$U_{p,q} \cap U_{p,q+1} \cap \ldots = A_{p,q}.$

For p given the x satisfying (26) are thus the elements of the measurable set 

$A_{p,1} \cup A_{p,2} \cup \ldots = B_p.$

Finally, to say that (26) holds for every p means that x belongs to the 
measurable set $B = \bigcap B_p$. Since the set defined by the condition $a \le f(x) \le b$ is 
equal to B to within a negligible set, it is measurable. 

Suppose conversely that the set $\{a \le f(x) \le b\}$ is measurable for any a 
and b, and assume first that $f \ge 0$. For $n, p \ge 1$ let us put 

$A_{n,p} = \{p/n \le f(x) < (p+1)/n\}.$

Since 

$[a, b[\ = \bigcup_q\ [a,\ b - 1/q],$

the set $A_{n,p}$ is the union of a countable family of sets of the form $\{u \le f(x) \le v\}$, 
so is measurable. If $\chi_{n,p}$ is the characteristic function of $A_{n,p}$ then $\chi_{n,p} \le 
n f(x)/p$; since we have assumed $N_1(f)$ finite the function $\chi_{n,p}$ is integrable 
by (L 23), and so also is the function 

$f_n(x) = \sum_{1 \le p \le n^2} \frac{p}{n}\,\chi_{n,p}(x).$

We have $f_n(x) \le f(x)$ for any n and x, as well as 

$f(x) - f_n(x) \le 1/n$ if $f(x) \le n$

as a figure, that of n° 30 for example, will show better than a calculation. 
It follows that $f(x) = \lim f_n(x)$ for every x, whence the integrability of f, 
by the dominated convergence theorem. Finally, if f is not positive, apply 
(L 24) to $f^+$ and $f^-$, qed. 
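
The staircase functions used in the converse can be tabulated pointwise. The sketch below assumes the sets are $A_{n,p} = \{p/n \le f(x) < (p+1)/n\}$ for $1 \le p \le n^2$ (the definition is reconstructed here from the surrounding inequalities, so treat it as an assumption) and works with the single value f(x) rather than with a function on X; `staircase` is an illustrative name.

```python
from fractions import Fraction
from math import floor

def staircase(value, n):
    """Value at a point of f_n = sum over 1 <= p <= n**2 of (p/n) * chi_{n,p},
    assuming A_{n,p} = {p/n <= f < (p+1)/n} and f(x) = value >= 0."""
    p = floor(value * n)          # the only p with p/n <= value < (p+1)/n
    if 1 <= p <= n * n:
        return Fraction(p, n)
    return Fraction(0)            # value < 1/n, or value >= n + 1/n

v = Fraction(22, 7)
for n in (1, 2, 4, 8, 16):
    print(n, staircase(v, n), float(v - staircase(v, n)))
```

For a fixed value the error is at most 1/n as soon as n exceeds that value, which is the pointwise convergence used before invoking dominated convergence.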

If one accepts that every reasonable set is measurable, it follows that in 
practice all the functions one meets in classical analysis are integrable, so 
long, of course, as $N_1(f) < +\infty$. 

Exercise. Assume that X is an interval in $\mathbb{R}$, choose an $a \in X$, and, for every 
$x \in X$, define 

$\mu(x) = \mu(]a, x])$ if $x \ge a$, $\qquad \mu(x) = -\mu(]x, a])$ if $x < a$. 

Show that $\mu(x)$ is increasing, right continuous, and that, for every $f \in L(X)$, 
$\mu(f)$ is the Stieltjes integral of f with respect to the function $\mu(x)$. 


§ 1. Truncated expansions — § 2. Summation formulae 


§ 1. Truncated expansions 


1 — Comparison relations 


Recall that in Chap. I, n° 3 and 4, we introduced relations which, given 
scalar functions defined on a set $X \subset \mathbb{R}$, allowed us to compare their “orders 
of magnitude” on a neighbourhood of a point a adherent to X, the case 
a = +∞ or −∞ not excluded, quite the contrary. These are the following: 

(1.1) $f(x) = O(g(x))$ when $x \to a$, 

which is equivalent to the existence of a constant M > 0 such that $|f(x)| \le 
M|g(x)|$ for every $x \in X$ near to a; 

(1.2) $f(x) \asymp g(x)$ when $x \to a$, 

which means that simultaneously f(x) = O(g(x)) and g(x) = O(f(x)); 

(1.3) $f(x) \sim g(x)$ when $x \to a$, 

equivalent to 

$\lim f(x)/g(x) = 1$

and finally 

(1.4) $f(x) = o(g(x))$ when $x \to a$, 

which is equivalent to $\lim f(x)/g(x) = 0$. 
We also saw that (3) can be expressed as 

$f(x) = g(x) + o(g(x))$

since $|f(x)/g(x) - 1| < \varepsilon$ can be written as $|f(x) - g(x)| < \varepsilon|g(x)|$. 
Chapter IV provided us with formulae of this type applicable to the power, 
exponential and logarithmic functions: 

(1.5) $x^b = O(x^a)$, $x \to 0 \iff \mathrm{Re}(b) \ge \mathrm{Re}(a)$, 
(1.6) $x^b = o(x^a)$, $x \to 0 \iff \mathrm{Re}(b) > \mathrm{Re}(a)$, 
(1.7) $x^b = O(x^a)$, $x \to +\infty \iff \mathrm{Re}(b) \le \mathrm{Re}(a)$, 
(1.8) $x^b = o(x^a)$, $x \to +\infty \iff \mathrm{Re}(b) < \mathrm{Re}(a)$, 
(1.9) $x^s = o(a^x)$, $x \to +\infty$ if $a > 1$, $s \in \mathbb{C}$, 
(1.10) $\log x = o(x^s)$, $x \to +\infty$ if $\mathrm{Re}(s) > 0$, 
(1.11) $\log x = o(1/x^s)$, $x \to 0$ if $\mathrm{Re}(s) > 0$. 

It is useful to remember the three last formulae as: 

$\lim a^{-x} x^s = 0$ when $x \to +\infty$ if $a > 1$, $s \in \mathbb{C}$, 
$\lim x^s \log x = 0$ when $x \to +\infty$ if $\mathrm{Re}(s) < 0$, 
$\lim x^s \log x = 0$ when $x \to 0$ if $\mathrm{Re}(s) > 0$. 

The theory of power series and Taylor’s formula provide other general 
results. For a power series 

$f(x) = a_m x^m + \ldots + a_p x^p + a_q x^q + \ldots$

where we write only the nonzero terms, we have 

(1.12) $f(x) - (a_m x^m + \ldots + a_p x^p) \sim a_q x^q$ when $x \to 0$

since the left hand side can be written $a_q x^q(1 + ?x + \ldots)$ with a series which 
tends to 1; this leads to formulae such as 

$e^x = 1 + x + x^2/2 + x^3/6 + o(x^3),$
$\log(1 + x) = x - x^2/2 + x^3/3 - x^4/4 + o(x^4),$

etc. when $x \to 0$. As for Taylor’s Formula, this shows that if a function f is 
of class $C^{n+1}$ on a neighbourhood of a point $a \in \mathbb{R}$, then 

(1.13) $f(a + h) = f(a) + f'(a)h + \ldots + f^{(n)}(a)h^n/n! + O(h^{n+1})$

as $h \to 0$ and even 

(1.14) $f(a + h) - [f(a) + f'(a)h + \ldots + f^{(n)}(a)h^n/n!] \sim f^{(n+1)}(a)h^{n+1}/(n+1)!$

if $f^{(n+1)}(a) \ne 0$ [Chap. V, eqn. (18.11) and (18.14)]. 

2 — Rules of calculation 


The symbols O and o obey rules of calculation which are easy to remember 
and also to prove, with the exception of those which apply to the quotients of 
asymptotic relations. We shall restrict ourselves to stating them in telegraphic 
style, with minimal indications of their proofs: there is little point in discussing 
them at length when we have already used them on several occasions in the 
preceding chapters. 

(2.1) $f = O(g)\ \&\ g = O(h) \implies f = O(h).$

For if $|f(x)| \le A|g(x)|$ and if $|g(x)| \le B|h(x)|$, then $|f(x)| \le AB|h(x)|$. 

(2.2) $f = O(g)\ \&\ g = o(h) \implies f = o(h).$

For if $|f(x)| \le A|g(x)|$ and if $|g(x)| \le r|h(x)|$, then $|f(x)| \le \varepsilon|h(x)|$ provided 
$r \le \varepsilon/A$. 

(2.3) $f = O(h)\ \&\ g = O(h) \implies f + g = O(h).$
(2.4) $f = o(h)\ \&\ g = o(h) \implies f + g = o(h).$
(2.5) $f' = O(g')\ \&\ f'' = O(g'') \implies f'f'' = O(g'g'').$
(2.6) $f' = O(g')\ \&\ f'' = o(g'') \implies f'f'' = o(g'g'').$

One might also write some of these rules in the following way, remem- 
bering the fact that a symbol such as O(g) denotes any function f such that 
f = O(g): 
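
In that symbolic notation, rules (2.1) to (2.6) would plausibly read as follows. This display is a reconstruction from the rules themselves, under the stated convention that O(g) and o(g) stand for arbitrary functions of the corresponding kind; it is offered as a sketch, not as the original formulae.

```latex
\begin{align*}
O(O(h)) &= O(h), & O(o(h)) &= o(h),\\
O(h) + O(h) &= O(h), & o(h) + o(h) &= o(h),\\
O(g')\,O(g'') &= O(g'g''), & O(g')\,o(g'') &= o(g'g'').
\end{align*}
```

Each equality is to be read from left to right only: any function of the form on the left is of the form on the right.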


Example 1. Let us multiply term-by-term the relations 

$e^x = 1 + x + x^2/2 + O(x^3), \qquad \sin x = x - x^3/6 + O(x^5)$

valid for $x \to 0$; calculating à la Newton one finds 

$e^x \sin x = (1 + x + x^2/2)(x - x^3/6) + (1 + x + x^2/2)\,O(x^5) + (x - x^3/6)\,O(x^3) + O(x^3)\,O(x^5)$
$= x + x^2 + x^3/3 - x^4/6 - x^5/12 + O(x^5) + O(x^4) + \ldots + O(x^8);$

but as $x^n = O(x^4)$ for $n \ge 4$ we have 

$e^x \sin x = x + x^2 + x^3/3 + O(x^4);$

one cannot derive anything more precise starting from these relations. 
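
The bookkeeping of Example 1 can be reproduced mechanically. In this small sketch, polynomials are lists of exact coefficients indexed by degree, and everything of degree at least 4 is discarded because it is absorbed by the O(x^4) term coming from x · O(x^3). The helper name `multiply_truncated` is illustrative.

```python
from fractions import Fraction

def multiply_truncated(p, q, order):
    """Multiply two polynomials (coefficient lists, index = degree) and keep
    only the terms of degree < order; the rest is absorbed into O(x**order)."""
    result = [Fraction(0)] * order
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j < order:
                result[i + j] += a * b
    return result

exp_part = [Fraction(1), Fraction(1), Fraction(1, 2)]                  # 1 + x + x^2/2
sin_part = [Fraction(0), Fraction(1), Fraction(0), Fraction(-1, 6)]   # x - x^3/6
product = multiply_truncated(exp_part, sin_part, 4)
print(product)  # coefficients of 1, x, x^2, x^3
```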

Example 2. When $x \to 0$, 

$(x^2 + x^4)^{1/3} = x^{2/3}(1 + x^2)^{1/3} = x^{2/3}\,[1 + x^2/3 - x^4/9 + O(x^6)]$

by the binomial series, whence 

$(x^2 + x^4)^{1/3} = x^{2/3} + x^{8/3}/3 - x^{14/3}/9 + O(x^{20/3}).$

In these calculations we have used the fact that $x^a O(x^b) = O(x^{a+b})$, a 
particular case of (5). 
There are also rules concerning the relation $f \sim g$. 

(2.7) $f \sim g\ \&\ g \sim h \implies f \sim h.$

For $f = g + o(g) = h + o(h) + o(h + o(h)) = h + o(h) + o(O(h)) = h + o(h)$. 

(2.8) $f' \sim g'\ \&\ f'' \sim g'' \implies f'f'' \sim g'g''$ and $f'/f'' \sim g'/g''.$

For $f'f''/g'g'' = (f'/g')(f''/g'')$, the product of two ratios tending to 1, etc. 
Or simply multiply the relations $f' = g' + o(g')$ and $f'' = g'' + o(g'')$. 


Example 3. Consider the ratio 

$\dfrac{x^2 - x + \log x}{x^2 - (\log x)^2}$

as x tends to +∞. In the numerator, x and log x are $o(x^2)$, so it is $\sim x^2$. In 
the denominator, log x is o(x), so $(\log x)^2$ is $o(x^2)$, so that the denominator 
also is $\sim x^2$. The fraction we are considering therefore tends to 1 as $x \to +\infty$. 
As we have already noted elsewhere, at infinity a polynomial is equivalent 
to its term of highest degree; a rational fraction is therefore equivalent to the 
quotient of the terms of highest degree in its numerator and denominator. 
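
A quick numerical check of the ratio from the example just given, as a floating-point sketch only: it illustrates the limit but proves nothing. The function name `ratio` is illustrative.

```python
import math

def ratio(x):
    """(x^2 - x + log x) / (x^2 - (log x)^2) for x > 1."""
    return (x * x - x + math.log(x)) / (x * x - math.log(x) ** 2)

for x in (1e1, 1e2, 1e3, 1e6):
    print(x, ratio(x))
```

The deviation from 1 behaves like 1/x, in agreement with the dominant neglected term −x in the numerator.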


3 — Truncated expansions 


The preceding examples — and more so those which follow — show that to 
study the behaviour of a function on a neighbourhood of a point a, it is 
useful to compare it to functions as simple as possible. If for example the 
function is represented by a convergent power series in x—a, one compares it 
to the partial sums of the latter, i.e. to polynomials. In general, and even in 
the most elementary situations, it is necessary to choose comparison functions 
a little less simple. 

Assume that one wishes to study the behaviour of a function f near 
x = 0, or only when $x \to 0+$. It may happen that there are a constant 
$a \ne 0$ and a real exponent s such that $f(x) \sim ax^s$. Then, by definition, 
$f(x) = ax^s + o(x^s)$, which encourages us to consider the difference $f(x) - ax^s$. 
It may happen that there are a constant $b \ne 0$ and a real exponent t such 
that $f(x) - ax^s \sim bx^t$; necessarily t > s. It may then happen that there are 
a constant $c \ne 0$ and a real exponent u > t such that $f(x) - ax^s - bx^t \sim cx^u$, 
and so on. 

In this context we shall call a generalised polynomial any function of the 
form 


(3.1) $p(x) = a_1 x^{s_1} + \ldots + a_n x^{s_n}$

where the $a_k$ are nonzero constants and where the real exponents $s_k$ satisfy 

$s_1 < s_2 < \ldots < s_n;$

we shall then say that f admits a truncated expansion of order s at the origin 
if there exists a generalised polynomial p such that 

(3.2) $f(x) = p(x) + o(x^s)$ when $x \to 0$. 

Since $x^t = o(x^s)$ for t > s, we may assume that, in (1), 

(3.3) $s_1 < s_2 < \ldots < s_n \le s;$

we then say that p is the principal part of order s of f at the origin. 
It is unique, for (2) clearly implies 

$f(x) = a_1 x^{s_1} + o(x^{s_1}) \sim a_1 x^{s_1}$

and so 

$a_1 = \lim f(x)/x^{s_1},$

which determines $a_1$; the higher coefficients are obtained similarly from the 
relations 

$f(x) - a_1 x^{s_1} - \ldots - a_k x^{s_k} \sim a_{k+1} x^{s_{k+1}}.$

It is clear that if one has two expansions $f(x) = p(x) + o(x^s)$ and $g(x) = 
q(x) + o(x^s)$ of the same order, then adding gives a truncated expansion of 
f + g. If the orders are different, naturally it is the smallest which is valid for 
the sum: from the relations 

$e^x = 1 + x + x^2/2 + x^3/6 + o(x^3), \qquad \cos x = 1 - x^2/2 + x^4/24 + o(x^5)$

one can deduce no more than 

$e^x + \cos x = 2 + x + x^3/6 + o(x^3)$

since it might be that $e^x$ has a truncated expansion of order 5 containing 
nonzero terms in $x^4$ and $x^5$... 

It is also easy to multiply truncated expansions term-by-term. An example
will suffice to indicate the method. One writes

(3.4)  $$e^x \cos x = \left[1 + x + x^2/2 + x^3/6 + o(x^3)\right] \cdot \left[1 - x^2/2 + x^4/24 + o(x^5)\right]$$


and multiplies mentally term-by-term; first one sees the terms of the form
$ax^s$, then the terms of the form $ax^s\,o(x^t) = o(x^{s+t})$, and finally a term
$o(x^s)\,o(x^t) = o(x^{s+t})$. Among the terms of the form $o(x^u)$ only that or those
having the smallest exponent $u$ are retained since all the others are themselves
$o(x^u)$; and among the terms of the form $ax^s$, only those of exponent $s \le u$
are to be retained, for the same reason. In the case (4), obviously $u = 3$
because of the product $o(x^3) \cdot 1$, so that


$$e^x \cos x = 1 + x - x^3/3 + o(x^3)$$


without needing to calculate any more terms. The fact that the exponents
may be neither integers nor positive does not change the method at all, since
it rests on the fact that $x^a = o(x^b)$ for $a > b$ when $x$ tends to 0.
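The bookkeeping just described is mechanical, and may be sketched in a few lines of Python. This is only an illustrative sketch under assumptions that are not in the text: a generalised polynomial is stored as a dictionary from exponents to coefficients, and the names `multiply`, `p`, `q` are chosen for the example. The function keeps the smallest exponent $u$ among the $o$-terms and drops every monomial of exponent $> u$.

```python
from fractions import Fraction as F

def multiply(p, s, q, t):
    """Multiply (p + o(x^s)) by (q + o(x^t)) as x -> 0.

    p, q are generalised polynomials stored as {exponent: coefficient}.
    Returns (product polynomial, order u of the resulting expansion).
    """
    # The o-terms of the product are a*x^e*o(x^t), o(x^s)*b*x^e and o(x^s)*o(x^t);
    # only the smallest of their exponents matters.
    u = min(min(p) + t, min(q) + s, s + t)
    product = {}
    for e1, a in p.items():
        for e2, b in q.items():
            if e1 + e2 <= u:                      # monomials of exponent > u are o(x^u)
                product[e1 + e2] = product.get(e1 + e2, 0) + a * b
    # Drop coefficients that cancelled out.
    return {e: c for e, c in sorted(product.items()) if c != 0}, u

# The two factors of (3.4): e^x to order 3 and cos x to order 5.
p = {0: F(1), 1: F(1), 2: F(1, 2), 3: F(1, 6)}
q = {0: F(1), 2: F(-1, 2), 4: F(1, 24)}
result, order = multiply(p, 3, q, 5)
```

With these inputs the $x^2$ terms cancel and the function returns the polynomial $1 + x - x^3/3$ together with the order 3, in agreement with the expansion of $e^x \cos x$ obtained above. Exact rational coefficients (`fractions.Fraction`) are used so that such cancellations are detected without rounding.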


4 — Truncated expansion of a quotient 
Suppose we are given two truncated expansions


$$f(x) = p(x) + o(x^s), \qquad g(x) = q(x) + o(x^t)$$


on a neighbourhood of $x = 0$ and seek to deduce the most precise truncated
expansion possible of $h(x) = f(x)/g(x)$. Let $bx^a$ be the term of lowest degree
in the generalised polynomial $q(x)$; then


$$h(x) = f_1(x)/g_1(x)$$


where the truncated expansion of $g_1(x) = b^{-1}x^{-a}g(x)$ begins with the
monomial 1 and is of order $t - a$, and that of $f_1(x) = b^{-1}x^{-a}f(x)$ is now of order
$s - a$. So we reduce to finding a truncated expansion of $1/g_1$ in the case where
$g_1$ is of the form


(4.1)  $$g_1(x) = 1 - r(x) + o(x^u), \qquad r(x) = b_2 x^{u_2} + \dots + b_n x^{u_n},$$

with nonzero coefficients and exponents satisfying

(4.2)  $$0 < u_2 < \dots < u_n \le u.$$

On putting $g_2(x) = r(x) + o(x^u)$, whence $g_1 = 1 - g_2$, one has

(4.3)  $$1/g_1(x) = 1 + g_2(x) + \dots + g_2(x)^{N-1} + g_2(x)^N/g_1(x)$$


for every integer $N > 0$. The function $g_2(x)^k$ is a sum of monomials whose
degrees are of the form $k_2u_2 + \dots + k_nu_n$ with $\sum k_i = k$, $k_i \ge 0$, and of similar
monomials where at least one of the factors is $o(x^u)$, so are themselves $o(x^u)$.
Since $o(x^u)$ already figures in the second term $g_2(x)$ of (3) one cannot hope
to deduce a truncated expansion of the left hand side of order $> u$ from (3).
The last term of (3) is equivalent to its numerator since $g_1(x) \sim 1$; in the
numerator, $g_2(x)$ is equivalent to the term of lowest degree of $r(x)$, whence


(4.4)  $$g_2(x)^N/g_1(x) \sim b_2^N x^{Nu_2} = b_2^N x^{Nu_2} + o(x^{Nu_2}).$$

(3) thus yields a relation of the form

$$1/g_1(x) = 1 + c_2x^{u_2} + \dots + o(x^u) + b_2^N x^{Nu_2} + o(x^{Nu_2}),$$


where the inexplicit terms are of degrees $> u_2$.

If $Nu_2 < u$, the term $o(x^u)$ is negligible with respect to the second $o$
and we have only obtained a truncated expansion of order $Nu_2 < u$; if on
the contrary $Nu_2 > u$, the last term of (4) is itself $o(x^u)$. Since we have no
wish to spend our energy in vain, nor to lose information, we must choose for
$N$ the smallest integer $> u/u_2$: going further adds only terms all negligible
with respect to $x^u$ arising from the powers of $g_2(x)$, while not going so far
diminishes the order of the expansion we obtain.

The method is general, but, to gain an understanding, it will be better 
to remember the principle and apply it to examples. 


Example 1. We seek a truncated expansion of order 1 at $x = 0$ of the function
$h(x) = e^x/(x^2 \sin x)$. Here $h(x) \sim x^{-3}$, so that a relation of the form $h(x) =
p(x) + o(x)$ can be written as $x^3h(x) = q(x) + o(x^4)$. We have to find a
truncated expansion of order 4 for


$$x^3h(x) = e^x/(\sin x/x),$$


so for

(4.5)  $$e^x = 1 + x + x^2/2 + x^3/6 + x^4/24 + o(x^4),$$

for

(4.6)  $$\sin x/x = 1 - x^2/3! + x^4/5! + o(x^4) = 1 - r(x) + o(x^4)$$


and for the reciprocal of (6). Since $r(x)$ is $\sim x^2$ up to a constant factor, its
square is $\sim x^4$ and its cube $\sim x^6 = o(x^4)$. One may simply write

$$\begin{aligned}
x/\sin x &= 1 + (x^2/3! - x^4/5!) + (x^2/3! - x^4/5!)^2 + o(x^4) \\
&= 1 + x^2/3! + \left[(1/3!)^2 - 1/5!\right]x^4 + o(x^4) \\
&= 1 + x^2/6 + 7x^4/360 + o(x^4).
\end{aligned}$$


It remains to multiply by (5) and by $x^{-3}$, which gives

(4.7)  $$\begin{aligned}
e^x/(x^2\sin x) &= x^{-3}\left(1 + x + x^2/2 + x^3/6 + x^4/24\right)\left(1 + x^2/6 + 7x^4/360\right) + o(x) \\
&= 1/x^3 + 1/x^2 + 2/3x + 1/3 + 13x/90 + o(x),
\end{aligned}$$

the coefficient of $x$ being $1/24 + 1/12 + 7/360 = 13/90$.
This formula provides very precise information on the behaviour of $f(x) =
e^x/(x^2\sin x)$ as $x$ tends to 0. To a first approximation, $f(x) \sim 1/x^3$, which
means that the ratio between the two members tends to 1. But their difference
increases indefinitely, and in a precise way is $\sim 1/x^2$, which does not prevent
the difference $f(x) - 1/x^3 - 1/x^2$ from growing indefinitely and, in fact, from
being $\sim 2/3x$; this time, the difference $f(x) - 1/x^3 - 1/x^2 - 2/3x$ tends to
$1/3$, etc.
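The calculation of Example 1 can be replayed with exact rational arithmetic. The sketch below is illustrative only and rests on assumptions not in the text: polynomials are dictionaries from exponents to coefficients, and the helper names `mul` and `reciprocal` are invented for the occasion. The function `reciprocal` implements the rule of n° 4: it sums $1 + r + \dots + r^{N-1}$ with $N$ the smallest integer $> u/u_2$.

```python
from fractions import Fraction as F

def mul(p, q, order):
    """Product of two polynomials {exponent: coeff}, dropping degrees > order."""
    out = {}
    for e1, a in p.items():
        for e2, b in q.items():
            if e1 + e2 <= order:
                out[e1 + e2] = out.get(e1 + e2, 0) + a * b
    return out

def reciprocal(r, order):
    """Expansion of 1/(1 - r(x)) up to x^order, r having positive exponents only."""
    u2 = min(r)                       # lowest exponent occurring in r
    N = 1
    while N * u2 <= order:            # smallest integer N with N*u2 > order
        N += 1
    total, power = {0: F(1)}, {0: F(1)}
    for _ in range(1, N):             # accumulate r, r^2, ..., r^(N-1)
        power = mul(power, r, order)
        for e, c in power.items():
            total[e] = total.get(e, 0) + c
    return total

r = {2: F(1, 6), 4: F(-1, 120)}       # sin x / x = 1 - r(x) + o(x^4)
inv = reciprocal(r, 4)                # expansion of x / sin x
exp4 = {k: F(1, f) for k, f in enumerate([1, 1, 2, 6, 24])}   # e^x to order 4
prod = mul(exp4, inv, 4)
h = {e - 3: c for e, c in sorted(prod.items())}               # multiply by x^-3
```

Here `inv` comes out as $1 + x^2/6 + 7x^4/360$, and `h` carries the coefficients $1, 1, 2/3, 1/3, 13/90$ on the exponents $-3, -2, -1, 0, 1$; the last one is the sum $1/24 + 1/12 + 7/360$.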

We remark, to conclude these generalities, that in practice one does not
confine oneself to using the power functions $x^s$; frequently, particularly when
examining the behaviour of a function “at infinity”, one has to compare
a given function with the functions $e^{sx}$ or $\log x$, $\log\log x$, etc., and more
generally with functions of the form $e^{sx}x^t\log^u x$, mainly when having to
determine the convergence of an integral on an interval unbounded to the
right (Chap. V, n° 22). The idea is always to order these monomials by
order of decreasing magnitude, so that, in a sum of monomials, each term is
negligible with respect to the preceding term.


5 — Gauss’ convergence criterion 


The $u_{n+1}/u_n$ criterion allows us to determine whether many simple series
converge, but, as we know, it is inconclusive if the ratio tends to 1. In his 
research on the hypergeometric series, C. F. Gauss obtained a very useful 
result for this case; the proof rests on simple, but ingenious, asymptotic 
evaluations. 


Gauss’ convergence criterion. Let $\sum u_n$ be a series with positive terms
and suppose that there exists a number $s$ such that


(5.1)  $$u_{n+1}/u_n = 1 - s/n + O(1/n^2).$$

Then the series converges if $s > 1$ and diverges if $s \le 1$.


Note that in this case the d’Alembert ratio tends to 1. It is easy to
remember the criterion: for the series to converge it is preferable
for the ratio not to tend to 1 too rapidly, in other words $s$ should be
greater than a certain limit, namely 1.

First let us present two examples (we need them in the proof) of series
for which we have a relation (1). For the series $v_n = 1/n^a$


(5.2)  $$v_{n+1}/v_n = (1 + 1/n)^{-a} = 1 - a/n + O(1/n^2)$$

by Newton’s binomial formula. For $w_n = 1/(n\log n)$ we have

$$w_{n+1}/w_n = [1 - 1/(n+1)]\,\log(n)/\log(n+1);$$


then


$$1 - \frac{1}{n+1} = 1 - \frac{1}{n}\cdot\frac{1}{1 + 1/n} = 1 - \frac{1}{n}\,(1 - 1/n + \dots) = 1 - 1/n + O(1/n^2),$$

$$\begin{aligned}
\frac{\log n}{\log(n+1)} &= \frac{\log n}{\log n + \log(1 + 1/n)} = \frac{1}{1 + \log(1 + 1/n)/\log n} \\
&= 1 - \frac{\log(1 + 1/n)}{\log n} + O\!\left(\left(\frac{\log(1 + 1/n)}{\log n}\right)^{2}\right) \\
&= 1 - \frac{1/n + O(1/n^2)}{\log n} + O\!\left(\left(\frac{1/n + O(1/n^2)}{\log n}\right)^{2}\right) \\
&= 1 - 1/(n\log n) + O(1/n^2),
\end{aligned}$$


whence


(5.3)  $$w_{n+1}/w_n = 1 - 1/n - 1/(n\log n) + O(1/n^2).$$


Note that the term $1/(n\log n)$ is $o(1/n)$ but not $O(1/n^2)$.
To establish Gauss’ criterion we also need a


Lemma. Let $\sum u_n$ and $\sum v_n$ be two series with positive terms; if

(5.4)  $$u_{n+1}/u_n \le v_{n+1}/v_n \quad \text{for } n \text{ large}$$

and if the series $\sum v_n$ converges, then so does the series $\sum u_n$.


We may assume that (4) holds for any $n$. On multiplying the first $n$
relations we obtain $u_n/u_1 \le v_n/v_1$, whence $u_n = O(v_n)$, qed.

Now we come to Gauss’ criterion. If $s > 1$ there is an $a$ such that
$1 < a < s$ and on comparing (1) and (2) we see that


$$v_{n+1}/v_n - u_{n+1}/u_n = (s - a)/n + O(1/n^2) \sim (s - a)/n,$$


so that the left hand side is $> 0$ for $n$ large. Since the series $\sum v_n$ converges for
$a > 1$, so does the series $\sum u_n$.

For $s < 1$, one chooses $a$ between $s$ and 1. The results are the opposite, so
that if the series $\sum u_n$ were convergent, so would be the series $\sum v_n$, absurd
for $a < 1$.

If $s = 1$, the above comparison is unusable because one does not know
the sign of an expression such as $O(1/n^2)$. But using (3) one has


$$u_{n+1}/u_n - w_{n+1}/w_n = 1/(n\log n) + O(1/n^2) \sim 1/(n\log n),$$


a result $> 0$ for $n$ large. Since $\sum 1/(n\log n)$ diverges (Chap. II, n° 12), so does
$\sum u_n$, qed.
Exercise. Show that, for $a$, $b$ real, $b$ not a negative integer, the series


$$u_n = \frac{a^2(a+1)^2\cdots(a+n)^2}{(n+1)^{1/2}\,b^2(b+1)^2\cdots(b+n)^2}$$


converges if and only if $b - a > 1/4$.
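Before attacking such an exercise it can help to estimate the exponent $s$ of (5.1) numerically, since $n(1 - u_{n+1}/u_n)$ tends to $s$. The following Python sketch is only an illustration; the function names and the sample values $a = 1$, $b = 1.5$ are assumptions made for the example. For the series of the exercise the ratio is $u_{n+1}/u_n = [(a+n+1)/(b+n+1)]^2\,[(n+1)/(n+2)]^{1/2}$.

```python
import math

def exercise_ratio(n, a, b):
    """u_{n+1}/u_n for the series of the exercise."""
    return ((a + n + 1) / (b + n + 1)) ** 2 * math.sqrt((n + 1) / (n + 2))

def gauss_exponent(ratio_fn, n=10**6):
    """Estimate s in u_{n+1}/u_n = 1 - s/n + O(1/n^2) by n * (1 - ratio)."""
    return n * (1 - ratio_fn(n))

# The series v_n = 1/n^2 of (5.2): the estimate should be close to a = 2.
s_power = gauss_exponent(lambda n: (1 + 1 / n) ** -2.0)

# The exercise with the sample values a = 1, b = 1.5.
s_exercise = gauss_exponent(lambda n: exercise_ratio(n, 1.0, 1.5))
```

For the sample values the estimate is close to $2(b - a) + 1/2 = 1.5$, which suggests where the threshold $b - a > 1/4$ comes from. Such an estimate is of course no proof, and it cannot separate the borderline case $s = 1$ from its neighbours.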
6 — The hypergeometric series
The series

(6.1)  $$F(a,b,c;z) = 1 + \sum_{n \ge 1} \frac{a(a+1)\cdots(a+n-1)\;b(b+1)\cdots(b+n-1)}{c(c+1)\cdots(c+n-1)}\,\frac{z^n}{n!} = \sum a_n z^n,$$
already found in Euler, plays a much more important role in mechanics, 
astronomy, physics, etc. than in mathematics proper, for the majority of the 
users’ “special functions” are particular cases of it. Moreover, it is probably 
the first series whose convergence was studied correctly — by Gauss who, in 
1813, showed that it converges for |z| < 1, diverges for |z| > 1 and, much less 
easy, examined what happens for |z| = 1. 

First, with $u_n = a_n z^n$,


(6.2)  $$|u_{n+1}/u_n| = |z| \cdot |(a+n)(b+n)/(c+n)(n+1)|,$$

an expression which tends to $|z|$, whence the radius of convergence. We shall
therefore suppose $|z| = 1$ in what follows, and also that $a$, $b$, $c$ are real,
to reduce the difficulties. Clearly we have to eliminate the case where one
of these parameters is a negative integer since then the series reduces to a
polynomial or is meaningless.


(i) We have

(6.3)  $$\begin{aligned}
a_{n+1}/a_n &= (a+n)(b+n)/(c+n)(n+1) = \frac{(1 + a/n)(1 + b/n)}{(1 + c/n)(1 + 1/n)} \\
&= \left(1 + \frac{a+b}{n} + \frac{ab}{n^2}\right)\left[1 - c/n + c^2/n^2 + O(1/n^3)\right]\cdot\left[1 - 1/n + 1/n^2 + O(1/n^3)\right] \\
&= 1 - \frac{c + 1 - a - b}{n} + \frac{ab - (c+1)(a+b) + c^2 + c + 1}{n^2} + O(1/n^3).
\end{aligned}$$


Since $|u_{n+1}/u_n| = 1 - s/n + O(1/n^2)$ with $s = c + 1 - a - b$, we obtain a first
result from Gauss’ criterion:


(6.4)  $$c > a + b \iff \text{absolute convergence for } |z| = 1.$$

(ii) Suppose $s < 0$; then $1 - s/n > 1$ and since

$$|u_{n+1}/u_n| \sim 1 - s/n,$$


so is the left hand side for $n$ large, so the $|u_n|$ increase. Whence

(6.5)  $$c < a + b - 1 \implies \text{divergence for } |z| = 1.$$


(iii) It remains to examine the interval $a + b - 1 \le c \le a + b$, on which
$0 \le s \le 1$ and where the series does not converge absolutely. First assume
$s > 0$ and write


$$a_{n+1}/a_n = 1 - v_n \quad \text{with } v_n \sim s/n,$$


whence

$$a_{n+1} = a_p(1 - v_p)\cdots(1 - v_n)$$

for $n > p$. For $n$ large, $v_n$ is $> 0$ like $s$ and tends to 0, so is $< 1$, so that for
$p$ well chosen, the product $(1 - v_p)\cdots(1 - v_n)$ is positive, decreases when
$n$ increases and so tends to a limit. Since $\log(1 - v_n) \sim -v_n \sim -s/n$,
the series with negative terms $\sum \log(1 - v_n)$ diverges, which shows that
$\lim (1 - v_p)\cdots(1 - v_n) = 0$ (see the similar arguments on infinite products in
Chap. IV, n° 17). Consequently, $a_n$ tends to 0, decreasing, up to a factor $a_p$.

For $s > 0$, the hypergeometric series $\sum a_n z^n$ therefore falls under
Dirichlet’s theorem (Chap. III, n° 11, Theorem 15 or Corollary 1). Consequently


(6.6)  $$a + b - 1 < c \le a + b \implies \begin{cases} \text{convergence for } |z| = 1,\ z \ne 1, \\ \text{divergence for } z = 1. \end{cases}$$


Divergence for $z = 1$ follows from the fact that the series $\sum a_n$ has terms
of constant sign for $n$ large, so we may apply Gauss’ criterion here with $s \le 1$.
(iv) If $s = 0$, then, by (3),


(6.7)  $$a_{n+1}/a_n = 1 + k/n^2 + O(1/n^3),$$


with


$$k = ab - (c+1)(a+b) + c^2 + c + 1 = (a - 1)(b - 1)$$


since $c = a + b - 1$. We have to distinguish three cases.

If $k > 0$, the left hand side of (7), which is $\sim 1 + k/n^2$, is $> 1$ for $n$ large.
Consequently, the series $\sum a_n z^n$ diverges for $|z| = 1$.

If $k = 0$, i.e. if $a = 1$ (in which case $c = b$) or if $b = 1$ (in which case
$c = a$), it is clear that the series reduces to $\sum z^n$, so diverges for $|z| = 1$.

If $k < 0$ we have $|u_{n+1}/u_n| = 1 + v_n$ where $v_n \sim k/n^2$ is the general term
of a convergent series all of whose terms are negative for $n$ large; the infinite
product of the $1 + v_n$ is then absolutely convergent, from which it follows that
$|u_n|$ tends to a nonzero limit (Chap. IV, n° 17, Theorem 13), which prevents
the series from converging.

To sum up, for $a$, $b$, $c$ real, we obtain the following table, which the reader
is not asked to memorise:


    $a + b < c$                 absolute convergence for $|z| = 1$
    $a + b - 1 < c \le a + b$   convergence for $|z| = 1$, $z \ne 1$
    $c \le a + b - 1$           divergence for $|z| = 1$.

We said above that the hypergeometric series includes many important
series as particular cases. The first is the binomial series


$$(1 + z)^s = \sum s(s-1)\cdots(s-n+1)\,z^n/n! = \sum (-s)(-s+1)\cdots(-s+n-1)\,(-z)^n/n! = F(-s,1,1;-z).$$


For $s$ real, we have complete results as to the behaviour of the series for
$|z| = 1$:


    $s > 0$           absolute convergence on $|z| = 1$,
    $-1 < s \le 0$    convergence for $|z| = 1$, $z \ne -1$,
    $s \le -1$        divergence for $|z| = 1$.
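The two tables lend themselves to a direct transcription. The Python sketch below is illustrative only; the function names are invented for the example, and the parameters are assumed real and not negative integers, as in the text. It also estimates $s = c + 1 - a - b$ from the coefficient ratio of (6.3), as a sanity check of the expansion.

```python
def unit_circle_behaviour(a, b, c):
    """Behaviour of F(a, b, c; z) on |z| = 1 for real a, b, c (table above)."""
    if c > a + b:
        return "absolute convergence"
    if c > a + b - 1:
        return "convergence except at z = 1"
    return "divergence"

def binomial_behaviour(s):
    """(1 + z)^s = F(-s, 1, 1; -z): the same table with a = -s, b = c = 1.

    The exceptional point -z = 1 of the hypergeometric series is z = -1 here.
    """
    return unit_circle_behaviour(-s, 1, 1)

def estimate_s(a, b, c, n=10**6):
    """Numerical estimate of s = c + 1 - a - b from n * (1 - a_{n+1}/a_n)."""
    ratio = (a + n) * (b + n) / ((c + n) * (n + 1))
    return n * (1 - ratio)
```

For instance `binomial_behaviour(0.5)`, `binomial_behaviour(-0.5)` and `binomial_behaviour(-1)` fall in the three successive rows of the second table, and `estimate_s(0.5, 1.0, 1.0)` is close to $0.5$.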


7 — Asymptotic study of the equation re” = t 


In this n° we shall detail a most ingenious exercise¹ whose principal interest,
at our level, is to make full use of the O and o techniques; as we said above, 
in this domain it is much less useful to learn the general theorems than to 
perform practical work. 

The problem is to study the behaviour as $t \to +\infty$ of the root $x$ of the
equation


(7.1)  $$xe^x = t.$$


The method consists of first obtaining a very crude estimate of the order
of magnitude of $x$ as a function of $t$, then substituting the result in (1) to
derive a second more precise estimate, then substituting the second result
in (1) to derive a third estimate more precise than the second, and so on.

First we show that for every $t \ge 0$, the equation (1) has a unique solution
$x \ge 0$ and that this is a continuous function of $t$. For $x \ge 0$, the map $x \mapsto xe^x$
is continuous and strictly increasing, zero for $x = 0$ and it increases
indefinitely with $x$: it therefore maps $\mathbb{R}_+$ onto $\mathbb{R}_+$ and has a continuous inverse
map, whence the existence, uniqueness and continuity of $x$ as a function of
$t$.

Since $x = 1$ for $t = e$, we see that


$$t > e \implies x > 1 \implies e^x < xe^x = t \implies x < \log t,$$

whence, for $t$ large ($t > e$), $0 < \log x < \log\log t$ and thus

$$\log x = O(\log\log t).$$

But $xe^x = t$ implies $x = \log t - \log x$ whence

(7.2)  $$x = \log t + O(\log\log t).$$


Since $\log\log t = o(\log t)$, we have $x \sim \log t$ at infinity.

¹ Taken from N. G. de Bruijn, Asymptotic Methods in Analysis (Groningen,
Noordhoff, 1960). We also advise reading Chap. III of the Calcul infinitésimal of
J. Dieudonné (Paris, Hermann, 1968), mainly n° 8.

Putting $\log t = y$, we have $x = y + O(\log y)$, a result that we can substitute
into the equation (1), put in the form $x = \log t - \log x$. To do this we have
to evaluate


$$\log x = \log[y + O(\log y)] = \log\{y[1 + O(\log y/y)]\} = \log y + \log[1 + O(\log y/y)]$$

and since in general $\log(1 + z) \sim z = O(z)$ as $z \to 0$, we have

$$\log x = \log y + O(\log y/y).$$

Since $x = \log t - \log x$ and $y = \log t$, we now obtain

(7.3)  $$x = \log t - \log\log t + O(\log\log t/\log t),$$


a more precise result than (2).
Now we substitute (3) in $x = \log t - \log x$, all the work being to deduce
information on $\log x$ from (3). Again putting $y = \log t$ we have $x = y - \log y +
O(\log y/y)$, i.e.

$$x = y(1 + z) \quad \text{with } z = -\log y/y + O(\log y/y^2).$$

Since $z$ tends to 0 we have

$$\begin{aligned}
\log x &= \log y + z - z^2/2 + O(z^3) \\
&= \log y - \log y/y + O(\log y/y^2) - \tfrac{1}{2}\left[-\log y/y + O(\log y/y^2)\right]^2 + O(z^3) \\
&= \log y - \log y/y + O(y^{-2}\log y) - \tfrac{1}{2}\,y^{-2}\log^2 y + O(y^{-3}\log^2 y) + O(y^{-4}\log^2 y) + O(z^3).
\end{aligned}$$


Since $z \sim -y^{-1}\log y$ we have $O(z^3) = O(y^{-3}\log^3 y) = y^{-2}\log y \cdot O(y^{-1}\log^2 y)$,
a term negligible with respect to the term $O(y^{-2}\log y)$ figuring in the result,
like the other terms in $O$. Thus in actual fact we have


$$\log x = \log y - \log y/y - \tfrac{1}{2}\,y^{-2}\log^2 y + O(y^{-2}\log y),$$


a relation in which each term is negligible with respect to the preceding. The
relation $x = \log t - \log x$ therefore leads to


(7.4)  $$x = \log t - \log\log t + \frac{\log\log t}{\log t} + \frac{1}{2}\left(\frac{\log\log t}{\log t}\right)^{2} + O\!\left(\frac{\log\log t}{\log^2 t}\right),$$

an estimate again more precise than (3).
If he continues one stage further the courageous reader will find that


$$x = \log t - \log\log t + \frac{\log\log t}{\log t} + \frac{1}{2}\left(\frac{\log\log t}{\log t}\right)^{2} - \frac{\log\log t}{\log^2 t} + \frac{1}{3}\left(\frac{\log\log t}{\log t}\right)^{3} - \frac{3}{2}\,\frac{(\log\log t)^2}{\log^3 t} + O\!\left(\frac{\log\log t}{\log^3 t}\right).$$
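The successive estimates can be compared with a numerically computed root. The Python sketch below is only an illustration: the function names and the sample value $t = 10^{10}$ are assumptions made for the example, and the root is obtained by Newton's method applied to $x + \log x = \log t$, a technique not used in the text. The three approximations are the explicit parts of (7.3), of (7.4) and of the next stage.

```python
import math

def solve(t):
    """Root x > 0 of x*exp(x) = t, by Newton's method on x + log x = log t."""
    y = math.log(t)
    x = y                                   # crude starting point, x ~ log t
    for _ in range(50):
        x -= (x + math.log(x) - y) / (1 + 1 / x)
    return x

def approximations(t):
    """Explicit parts of (7.3), (7.4) and of the following stage."""
    L1 = math.log(t)
    L2 = math.log(L1)
    first = L1 - L2
    second = first + L2 / L1 + 0.5 * (L2 / L1) ** 2
    third = second - L2 / L1 ** 2 + (L2 / L1) ** 3 / 3 - 1.5 * L2 ** 2 / L1 ** 3
    return first, second, third

t = 1e10
x = solve(t)
errors = [abs(x - a) for a in approximations(t)]
```

For this value of $t$ the three errors decrease from about $10^{-1}$ to a few $10^{-3}$ and then to a few $10^{-4}$, in line with the orders $O(\log\log t/\log t)$, $O(\log\log t/\log^2 t)$ and $O(\log\log t/\log^3 t)$. The improvement is slow because $\log\log t$ grows so slowly.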


8 — Asymptotics of the roots of sin x · log x = 1


(de Bruijn, p. 33, exercise 1). On examining the graphs of the functions $\sin x$
and $1/\log x$ one sees immediately that for every $n \ge 1$ the equation has two
roots between $2n\pi$ and $(2n+1)\pi$; one of them, $x_n$, lies between $2n\pi$ and
$2n\pi + \pi/2$, the other, $y_n$, between $2n\pi + \pi/2$ and $2n\pi + \pi$. Let us examine,
for example, the behaviour of $x_n$.


Since it is geometrically clear that $x_n \sim 2\pi n$, let us put

(8.1)  $$x_n = 2\pi n + u_n = 2\pi n(1 + v_n)$$

with $0 < u_n < \pi/2$, $0 < v_n < 1$ and

(8.2)  $$\log x_n = \log(2\pi n) + \log(1 + v_n), \qquad \sin x_n = \sin u_n = 1/\log x_n.$$

Now

$$1/[\log(2\pi n)\sin u_n] = \log(x_n)/\log(2\pi n) = 1 + \log(1 + v_n)/\log(2\pi n) = 1 + w_n.$$


Since $1 + v_n < 2$, the third member tends to 1, thus also the first, which
shows that

(8.3)  $$u_n \sim \sin u_n \sim 1/\log(2\pi n),$$

then that $u_n \sim 1/\log(2\pi n)$ tends to 0 and that


(8.4)  $$v_n = u_n/2\pi n \sim 1/[2\pi n\log(2\pi n)].$$

Hence

$$w_n = \log(1 + v_n)/\log(2\pi n) = \left[v_n + O(v_n^2)\right]/\log(2\pi n).$$

Since $w_n$ tends to 0 we have


$$\begin{aligned}
\log(2\pi n)\sin u_n &= 1/(1 + w_n) = 1 - w_n + O(w_n^2) \\
&= 1 - v_n/\log(2\pi n) + O(v_n^2)/\log(2\pi n) + O(v_n^2)/\log^2(2\pi n) \\
&= 1 - v_n/\log(2\pi n) + O(v_n^2)/\log(2\pi n),
\end{aligned}$$


whence

(8.5)  $$\sin u_n = 1/\log(2\pi n) - v_n/\log^2(2\pi n) + O(v_n^2)/\log^2(2\pi n).$$


But $\sin u_n = u_n + O(u_n^3) = u_n + O(1/\log^3(2\pi n))$ by (3), whence


$$\begin{aligned}
u_n &= 1/\log(2\pi n) - v_n/\log^2(2\pi n) + O(v_n^2)/\log^2(2\pi n) + O(1/\log^3(2\pi n)) \\
&= 1/\log(2\pi n) + O(1/n\log^3(2\pi n)) + O(1/n^2\log^4(2\pi n)) + O(1/\log^3(2\pi n)).
\end{aligned}$$


On the right hand side the second and the third terms are negligible with
respect to the last, which therefore dominates. At the point where we are now
we cannot say anything more precise than


(8.6)  $$u_n = 1/\log(2\pi n) + O(1/\log^3(2\pi n)),$$


which nevertheless improves on (3).
To get further we write that $\sin u_n = u_n - u_n^3/6 + O(u_n^5)$. Using (6) gives


$$\begin{aligned}
\sin u_n &= u_n - \left[1/\log(2\pi n) + O(1/\log^3(2\pi n))\right]^3/6 + O(1/\log^5(2\pi n)) \\
&= u_n - \left[1/\log^3(2\pi n) + O(1/\log^5(2\pi n))\right]/6 + O(1/\log^5(2\pi n)),
\end{aligned}$$

whence, by (5),


$$\begin{aligned}
u_n = {} & 1/\log(2\pi n) - v_n/\log^2(2\pi n) + O(v_n^2)/\log^2(2\pi n) \\
& + 1/6\log^3(2\pi n) + O(1/\log^5(2\pi n)) + O(1/\log^5(2\pi n));
\end{aligned}$$


since $v_n = O(1/n\log(2\pi n))$ by (4), the terms containing $v_n$ are negligible
with respect to $O(1/\log^5(2\pi n))$ and there remains


(8.7)  $$u_n = 1/\log(2\pi n) + 1/6\log^3(2\pi n) + O(1/\log^5(2\pi n)),$$


which improves (6). Here, as in the preceding n°, the reader can continue the
calculations and/or examine the behaviour of the other series of roots $y_n$.
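The estimates (8.6) and (8.7) are easy to test numerically. In the Python sketch below, which is only an illustration, the root $x_n$ is located by bisection on $[2\pi n, 2\pi n + \pi/2]$, where $\sin x \cdot \log x - 1$ is increasing and changes sign; the function names and the sample index $n = 1000$ are assumptions made for the example.

```python
import math

def root(n):
    """Root x_n of sin(x)*log(x) = 1 in [2*pi*n, 2*pi*n + pi/2], by bisection."""
    lo, hi = 2 * math.pi * n, 2 * math.pi * n + math.pi / 2
    for _ in range(100):
        mid = (lo + hi) / 2
        if math.sin(mid) * math.log(mid) - 1 < 0:
            lo = mid                  # the function is still negative: root is to the right
        else:
            hi = mid
    return (lo + hi) / 2

n = 1000
L = math.log(2 * math.pi * n)
u = root(n) - 2 * math.pi * n         # u_n of (8.1)
err6 = abs(u - 1 / L)                 # error of the explicit part of (8.6)
err7 = abs(u - (1 / L + 1 / (6 * L ** 3)))   # error of the explicit part of (8.7)
```

For $n = 1000$ one has $\log(2\pi n) \approx 8.75$; the first error is of the order of $1/6\log^3(2\pi n)$ and the second is smaller by roughly two orders of magnitude, as the remainder $O(1/\log^5(2\pi n))$ leads one to expect.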
9 — Kepler’s equation


We saw in Chap. II, at the end of n° 16, that Kepler’s equation $u - e\sin u = \omega t$
has one and only one root $u$ provided that the eccentricity $e$ of the elliptical
motion is $< 1$. To simplify the notation a little, and to avoid confusing $e$ with
2.71828..., let us write it as


(9.1)  $$u = \varphi + \varepsilon\sin u.$$


Laplace (see the 130-page notice by Gillispie in the supplement to DSB),
the author of a treatise on celestial mechanics much more advanced than
Newton’s Principia, proved (?) that $u$ is the sum of a power series in $\varepsilon$ whose
coefficients, depending on $\varphi$, are determined by an extraordinarily simple
formula:


(9.2)  $$u = \varphi + \varepsilon(\sin\varphi)/1! + \varepsilon^2(\sin^2\varphi)'/2! + \varepsilon^3(\sin^3\varphi)''/3! + \dots$$


One can make (2) plausible by putting $u = \varphi + v$, so that $v = \varepsilon\sin u$
tends to 0 with $\varepsilon$, and seeking an asymptotic evaluation of $v$; this does not
replace a power series, but one works with the tools at one’s disposal.

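Clearly $v = O(\varepsilon)$ and indeed


$$\begin{aligned}
v &= \varepsilon(\sin\varphi\cos v + \cos\varphi\sin v) \\
&= \varepsilon\sin\varphi\left(1 + O(\varepsilon^2)\right) + \varepsilon\cos\varphi \cdot O(\varepsilon) = \varepsilon\sin\varphi + O(\varepsilon^2),
\end{aligned}$$


whence


$$u = \varphi + \varepsilon\sin\varphi + O(\varepsilon^2) = \varphi + \varepsilon\sin\varphi + \varepsilon^2 w \quad \text{with } w = O(1).$$


Then by (1)

$$\begin{aligned}
\varepsilon\sin\varphi + \varepsilon^2 w &= \varepsilon\sin(\varphi + \varepsilon\sin\varphi + \varepsilon^2 w) \\
&= \varepsilon\sin\varphi\cos(\varepsilon\sin\varphi + \varepsilon^2 w) + \varepsilon\cos\varphi\sin(\varepsilon\sin\varphi + \varepsilon^2 w) \\
&= \varepsilon\sin\varphi\left[1 - (\varepsilon\sin\varphi + \varepsilon^2 w)^2/2 + O(\varepsilon^4)\right] + \varepsilon\cos\varphi\left[\varepsilon\sin\varphi + \varepsilon^2 w + O(\varepsilon^3)\right],
\end{aligned}$$

whence


$$\varepsilon^2 w = \varepsilon^2\sin\varphi\cos\varphi + \varepsilon^3\left(w\cos\varphi - \sin^3\varphi/2\right) + O(\varepsilon^4),$$


i.e.

$$w = \sin\varphi\cos\varphi + \varepsilon\left(w\cos\varphi - \sin^3\varphi/2\right) + O(\varepsilon^2).$$

Since $w = O(1)$, i.e. is bounded, one infers immediately that $w = \sin\varphi\cos\varphi +
O(\varepsilon)$, whence, substituting in the right hand side of the preceding relation,


$$w = \sin\varphi\cos\varphi + \varepsilon\left(\sin\varphi\cos^2\varphi - \sin^3\varphi/2\right) + O(\varepsilon^2).$$

Thus

$$u = \varphi + \varepsilon\sin\varphi + \varepsilon^2\sin\varphi\cos\varphi + \varepsilon^3\left(\sin\varphi\cos^2\varphi - \sin^3\varphi/2\right) + O(\varepsilon^4),$$


which yields the first terms of formula (2). The calculation becomes more
and more painful as one pushes further and further.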
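The order of the remainder can be checked numerically. The following Python sketch is an illustration only: the root of (9.1) is computed by iterating $u \mapsto \varphi + \varepsilon\sin u$, which is a contraction for $|\varepsilon| < 1$, and the function names and the sample values $\varphi = 1$, $\varepsilon = 0.1$ and $0.05$ are assumptions made for the example. If the remainder is $O(\varepsilon^4)$, halving $\varepsilon$ should divide the error by about 16.

```python
import math

def kepler_root(phi, eps, iterations=200):
    """Root of u = phi + eps*sin(u), by fixed-point iteration (|eps| < 1)."""
    u = phi
    for _ in range(iterations):
        u = phi + eps * math.sin(u)
    return u

def expansion(phi, eps):
    """Explicit terms of the expansion of u up to order eps^3."""
    s, c = math.sin(phi), math.cos(phi)
    return phi + eps * s + eps ** 2 * s * c + eps ** 3 * (s * c ** 2 - s ** 3 / 2)

phi = 1.0
err_a = abs(kepler_root(phi, 0.10) - expansion(phi, 0.10))
err_b = abs(kepler_root(phi, 0.05) - expansion(phi, 0.05))
ratio = err_a / err_b
```

With these values the ratio of the two errors is close to $2^4 = 16$; it is not exactly 16 because of the terms in $\varepsilon^5$ and beyond.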

The result was then extended by Lagrange to much more general
equations, namely, in his notation,


(9.3)  $$z = x + yf(z).$$


Since, for these Gentlemen, all functions arising in Nature, or even in
mathematics, are analytic outside isolated points (as it happens, they were right
to believe so if $f$ is, but then to prove this ...), Lagrange set himself to
calculate the expansion of $z = \sum a_n y^n$ as a power series, where the
coefficients $a_n$ of course depend on $x$. The direct way would be, for $x$ given, to
differentiate (3) indefinitely with respect to $y$, and to deduce the relations


$$z' = f(z) + yf'(z)z', \qquad z'' = 2f'(z)z' + y\left[f''(z)z'^2 + f'(z)z''\right],$$

$$z''' = 2f''(z)z'^2 + 2f'(z)z'' + f''(z)z'^2 + f'(z)z'' + y\left[f'''(z)z'^3 + 3f''(z)z'z'' + f'(z)z'''\right],$$


etc. and to find successively their values for $y = 0$:


$$z(0) = x, \qquad z'(0) = f(x), \qquad z''(0) = 2f'(x)f(x) = \left[f(x)^2\right]',$$

$$z'''(0) = 3f''(x)f(x)^2 + 6f'(x)^2f(x) = \left[f(x)^3\right]'',$$

etc. This is, in another form, what we have done above for Kepler’s equation.
The first results suggest the formula


(9.4)  $$z = x + \sum y^n\left[f(x)^n\right]^{(n-1)}/n!$$


using Maclaurin. But one falls rapidly, as above, into impossible calculations.
Lagrange’s method of establishing (4), at least formally, is considerably more
ingenious.

His idea was to consider $z$ as a function of $y$ and of $x$ and to differentiate
(3) with respect to each of these two variables in order to calculate the
coefficients $D_2^n z(x,0)$ of the Maclaurin series of $z$ with respect to $y$.

To start with, thanks to the Chain Rule (Chap. III, n° 21), one finds


(9.5)  $$D_1z = 1 + yf'(z)D_1z, \qquad D_2z = f(z) + yf'(z)D_2z,$$

whence $(D_1z - 1)D_2z = [D_2z - f(z)]D_1z$, and consequently


(9.6)  $$D_2z = f(z)D_1z.$$

For $y = 0$, (5) gives


(9.7)  $$D_1z(x,0) = 1, \qquad D_2z(x,0) = f(x)$$


since then $z = x$. Differentiating (6) with respect to $y$,


$$\begin{aligned}
D_2^2z &= f'(z)D_2z\,D_1z + f(z)D_2D_1z = f'(z)D_2z\,D_1z + f(z)D_1D_2z \\
&= D_1\left[D_2z \cdot f(z)\right] = D_1\left[D_1z \cdot f(z)^2\right]
\end{aligned}$$

by (6) and $D_1D_2 = D_2D_1$ (Chap. III, n° 23). Whence $D_2^2z(x,0) = \left[f(x)^2\right]'$

since, for $y = 0$, we have $D_2z \cdot f(z) = f(x)^2$ by (7). Suppose we have proved

that


(9.8)  $$D_2^nz = D_1^{n-1}\left[D_1z \cdot f(z)^n\right]$$

for any $x$ and $y$, and differentiate with respect to $y$. Using (6) again, and
$D_1D_2 = D_2D_1$, we get

$$\begin{aligned}
D_2^{n+1}z &= D_1^{n-1}D_2\left[D_1z \cdot f(z)^n\right] \\
&= D_1^{n-1}\left[D_2D_1z \cdot f(z)^n + D_1z \cdot nf(z)^{n-1}f'(z)D_2z\right] \\
&= D_1^{n-1}\left[D_1D_2z \cdot f(z)^n + D_2z \cdot nf(z)^{n-1}f'(z)D_1z\right] \\
&= D_1^n\left[D_2z \cdot f(z)^n\right] = D_1^n\left[D_1z \cdot f(z)^{n+1}\right],
\end{aligned}$$


which is (8) for $n + 1$.
The relation (8) is therefore valid for any $n$ and yields the formula


$$D_2^nz(x,0) = \left[f(x)^n\right]^{(n-1)}$$


which justifies (4), at least formally.
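A case in which everything can be computed by hand gives a convenient test of (9.4). Take $f(z) = z^2$, an illustrative choice that is not in the text: the equation $z = x + yz^2$ is quadratic in $z$, and its root tending to $x$ as $y \to 0$ is $z = \left(1 - \sqrt{1 - 4xy}\right)/2y$. On the other hand $\left[x^{2n}\right]^{(n-1)}/n! = \frac{(2n)!}{(n+1)!\,n!}\,x^{n+1}$. The Python sketch below, with function names invented for the example, compares the two for sample values with $4xy < 1$.

```python
import math

def lagrange_coefficient(n):
    """Coefficient of x^(n+1) in [x^(2n)]^(n-1) / n!  (the case f(z) = z^2)."""
    c = 1
    for k in range(n - 1):            # differentiate x^(2n) n-1 times
        c *= 2 * n - k
    return c // math.factorial(n)     # the quotient is an integer

def lagrange_series(x, y, terms=60):
    """Partial sum of (9.4) for f(z) = z^2."""
    return x + sum(lagrange_coefficient(n) * y ** n * x ** (n + 1)
                   for n in range(1, terms + 1))

def exact_root(x, y):
    """Root of z = x + y*z^2 which tends to x as y -> 0."""
    return (1 - math.sqrt(1 - 4 * x * y)) / (2 * y)

x, y = 0.5, 0.2
difference = abs(lagrange_series(x, y) - exact_root(x, y))
```

The coefficients 1, 2, 5, 14, ... produced by `lagrange_coefficient` are the Catalan numbers, and for $x = 0.5$, $y = 0.2$ the partial sum agrees with the closed form to within rounding error. This only checks the formula in one example; it says nothing about convergence in general.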

In fact, Lagrange went even further; instead of just expanding $z$ he
expanded an “arbitrary” function of $z$, say $u = \varphi(z)$. Since $D_1u = \varphi'(z)D_1z$
and $D_2u = \varphi'(z)D_2z$, the relation (6) becomes


$$D_2u = f(z)D_1u,$$

which allows one to calculate as we have just done, this time with

(9.9)  $$D_2^nu(x,0) = \left[\varphi'(x)f(x)^n\right]^{(n-1)}$$


and “thus”


$$\varphi(z) = \varphi(x) + \sum y^n\left[\varphi'(x)f(x)^n\right]^{(n-1)}/n!.$$


The proof consists of establishing the relation

$$D_2^nu = D_1^{n-1}\left[D_1u \cdot f(z)^n\right],$$


which replaces (8), by induction as above; for $y = 0$ we have $D_1u =
\varphi'(z)D_1z = \varphi'(x)$, whence (9).

10 — Asymptotics of the Bessel functions
Consider the differential equation

(10.1)  $$x'' + \left(1 - c/t^2\right)x = 0,$$


where $x$ is an unknown function of the real variable $t \ne 0$ and $c$ is a nonzero
constant. We set ourselves to study the asymptotic behaviour of its solutions
for $t$ large. We shall divide this relatively difficult but highly instructive
exercise into several parts.


Passage to an integral equation 


Since (1) can be written


(10.1’)  $$x'' + x = cx/t^2,$$


one may assume that at infinity its solutions resemble those of the much
simpler equation $y'' + y = 0$, which has as its solutions at least (and, we shall
see, at most) the functions


$$y(t) = ae^{it} + be^{-it}, \quad \text{whence } y'(t) = iae^{it} - ibe^{-it},$$

where $a$ and $b$ are arbitrary constants. In the general case one puts

(10.2)  $$x(t) = a(t)e^{it} + b(t)e^{-it}, \qquad x'(t) = ia(t)e^{it} - ib(t)e^{-it}$$


where $a(t)$ and $b(t)$ are now functions that one may easily calculate from $x$
and $x'$, by multiplying the relations (2) by $e^{it}$ or $e^{-it}$. This is the method
of variation of constants (Johann Bernoulli, end of the XVIIth century, for
equations of the first order, Lagrange in the general case) which applies to
all differential equations in which the unknown function and its derivatives
occur linearly, but which one applies here in a nonclassical way since one is
making believe that the function $cx(t)t^{-2}$ occurring on the right hand side
of (1’) is known [if it were, the method would provide all the solutions of (1’)
in terms of integrals involving the right hand side].

The second relation (2), which seems to contradict the Chain Rule grossly,
is in fact equivalent to


(10.3)  $$a'(t)e^{it} + b'(t)e^{-it} = 0.$$


It then follows that


$$x'' = -ae^{it} - be^{-it} + ia'e^{it} - ib'e^{-it} = -x + ia'e^{it} - ib'e^{-it}$$

by the relations (2). Equation (1) can therefore be written as


(10.4)  $$ia'(t)e^{it} - ib'(t)e^{-it} = cx(t)/t^2.$$

One deduces from (3) and (4) that

(10.5)  $$2ia'(t) = ct^{-2}x(t)e^{-it}, \qquad 2ib'(t) = -ct^{-2}x(t)e^{it},$$


whence, using the FT,


$$2ia(t) - 2ia(t_0) = c\int_{t_0}^{t} x(u)e^{-iu}u^{-2}\,du,$$

$$2ib(t) - 2ib(t_0) = -c\int_{t_0}^{t} x(u)e^{iu}u^{-2}\,du.$$


We must assume $t_0 \ne 0$ and $t$ of the same sign as $t_0$ because of the factor
$u^{-2}$, not integrable on a neighbourhood of 0. We shall assume them $> 0$ in
all that follows, the opposite case being treated similarly. Substituting in the
first relation (2), one finds


(10.6)  $$x(t) = p_0(t) + c\int_{t_0}^{t} x(u)\sin(t - u)u^{-2}\,du$$

with $p_0(t) = a_0e^{it} + b_0e^{-it}$, where $a_0 = a(t_0)$, $b_0 = b(t_0)$. Instead of, like (1),
involving the function $x$ and its derivatives, (6) involves $x$ and an integral
featuring the function $x$ itself; this is an integral equation. It does not even
assume $x$ to be differentiable: the continuity of $x$ is enough for (6) to make
sense. It is (6) which will allow us to examine the behaviour of $x$ at infinity.

One may conversely verify that every continuous solution of (6) is in fact
$C^\infty$ and satisfies (1). The theorem on differentiation under the $\int$ sign with
variable limits (Chap. V, n° 12, Theorem 13) in fact shows that the right
hand side is differentiable and that


(10.7)  $$x'(t) = p_0'(t) + c\int_{t_0}^{t} x(u)\cos(t - u)u^{-2}\,du$$


since the function integrated in (6) is zero for $u = t$. This relation in its turn
shows that $x'$ is differentiable and that


(10.8)  $$x''(t) = p_0''(t) - c\int_{t_0}^{t} x(u)\sin(t - u)u^{-2}\,du + cx(t)t^{-2}$$


since $x(u)\cos(t - u)u^{-2} = x(t)t^{-2}$ for $u = t$. Since $p_0 + p_0'' = 0$, one finds (1)
again, on adding the result to (6). The fact that $x$ is $C^\infty$ is then obvious,
either because the function integrated in (6) has continuous derivatives of
arbitrary order with respect to $t$, or because the differential equation shows
that if $x$ is $C^p$, then it is automatically $C^{p+2}$ away from the origin.


First bound for the solutions

Let us agree provisionally that there is a solution of (6) defined for $t > 0$
and then show it is bounded at infinity. Indeed, let $M(t)$ be the upper bound
of $|x(u)|$ on the interval $[t_0, t]$ and $M_0$ that of $|p_0(u)|$ in $\mathbb{R}$, clearly finite.
(6) shows that, for $t_0 \le t' \le t$,


$$|x(t')| \le M_0 + |c|M(t)\int_{t_0}^{t'} u^{-2}\,du \le M_0 + |c|M(t)/t_0,$$


whence, passing to the sup,

(10.9)  $$M(t) \le M_0 + |c|M(t)/t_0.$$


If we have chosen $t_0$ large enough that $|c|/t_0 < 1/2$ we may deduce that $M(t) <
2M_0$, qed.
Now let us show that there exist constants $a_1$, $b_1$ such that


(10.10)  $$x(t) = a_1e^{it} + b_1e^{-it} + O(1/t) = p_1(t) + O(1/t).$$


Since $x(t)$ is indeed bounded, the integral in (6), taken from $t_0$ to $+\infty$, is
absolutely convergent like that of the function $1/u^2$. Thus


(10.11)  $$x(t) = p_0(t) + c\int_{t_0}^{+\infty} x(u)\sin(t - u)u^{-2}\,du - c\int_{t}^{+\infty} x(u)\sin(t - u)u^{-2}\,du.$$


Expressing $\sin(t - u)$ in terms of complex exponentials, we see that the first
integral is, like $p_0(t)$, a linear combination of $e^{it}$ and $e^{-it}$ with coefficients
independent of $t$, whence


(10.12)  $$x(t) = p_1(t) - c\int_{t}^{+\infty} x(u)\sin(t - u)u^{-2}\,du$$

where $p_1(t)$ is a linear combination of $e^{it}$ and $e^{-it}$ with constant coefficients.
Since the function $x(u)\sin(t - u)$ is bounded for $u \ge t_0 > 0$, the integral is,
up to a constant factor, majorised by that of $u^{-2}$, i.e. by $1/t$, qed.
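The relation (10.10) can be watched at work in one explicit case. For the particular value $c = 2$ (an illustrative choice, not taken from the text) one checks by differentiating twice that $x(t) = \sin t/t - \cos t$ satisfies (1); here $p_1(t) = -\cos t$ and the remainder is $\sin t/t$. The Python sketch below, whose function names are invented for the example, integrates (1) numerically by the classical Runge-Kutta method of order 4, a technique not used in the text, and compares the result with this closed form.

```python
import math

C = 2.0   # illustrative value of the constant c in x'' + (1 - c/t^2) x = 0

def exact(t):
    """For c = 2, x(t) = sin(t)/t - cos(t) is a solution of the equation."""
    return math.sin(t) / t - math.cos(t)

def exact_prime(t):
    """Derivative of the closed-form solution, used only for the initial data."""
    return math.cos(t) / t - math.sin(t) / t ** 2 + math.sin(t)

def rk4(t0, t1, steps):
    """Integrate x'' = -(1 - C/t^2) x from t0 to t1; returns x(t1)."""
    def acc(t, x):
        return -(1 - C / t ** 2) * x
    h = (t1 - t0) / steps
    t, x, v = t0, exact(t0), exact_prime(t0)
    for _ in range(steps):
        k1x, k1v = v, acc(t, x)
        k2x, k2v = v + h / 2 * k1v, acc(t + h / 2, x + h / 2 * k1x)
        k3x, k3v = v + h / 2 * k2v, acc(t + h / 2, x + h / 2 * k2x)
        k4x, k4v = v + h * k3v, acc(t + h, x + h * k3x)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += h
    return x

numeric = rk4(1.0, 50.0, 50000)
# t * |x(t) - p_1(t)| stays bounded, here by 1, which is the O(1/t) of (10.10).
scaled_remainders = [t * abs(exact(t) + math.cos(t)) for t in (10.0, 100.0, 1000.0)]
```

The numerical solution at $t = 50$ agrees with the closed form, and the products $t\,|x(t) + \cos t| = |\sin t|$ remain bounded by 1, as (10.10) requires.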

The relation (12) allows us to complete the existence theorem for solutions
(we shall prove it below) with a uniqueness theorem: there exists only one
solution of (12) for $p_1$ given. Since $p_1$ depends on two arbitrary constants,
this means that the set of solutions is a vector space of dimension 2 over $\mathbb{C}$.

On subtraction we reduce to proving that, if $p_1$ is zero, then so likewise
is $x$. But denote now by $M(t)$ the upper bound of $|x(u)|$ for $u \ge t$. For $t' \ge t$
we clearly have

$$|x(t')| \le |c|M(t)/t$$


since the integral of $u^{-2}$ between $t'$ and $+\infty$, equal to $1/t'$, is $\le 1/t$. Whence,
on passing to the sup, $M(t) \le |c|M(t)/t$. Substituting this result in the
integral equation, we now find

$$|x(t')| \le |c|^2M(t)\int_{t'}^{+\infty} u^{-3}\,du,$$

whence $M(t) \le |c|^2M(t)/2t^2$. Substituting again in the equation, we will
find $M(t) \le |c|^3M(t)/3!\,t^3$, etc. In short, $M(t) \le M(t)|c/t|^n/n!$ for any $n$.
Since, for $t \ne 0$ given, the right hand side tends to 0 when $n \to +\infty$, we find
$M(t) = 0$ for any $t > 0$, whence $x(t) = 0$, qed.


Existence of solutions 


To go further than (10) in studying $x(t)$ at infinity, one might, as always,
iterate the calculation, i.e. substitute (10) in (12) and so on indefinitely.
We are going to adopt a slightly different method which will at the same
time show the existence of the solutions; this is the method of successive
approximations, which consists of extending the method of constructing the
roots of an equation $x = f(x)$ expounded in Chap. II, n° 16, Theorem 12 and
in Chap. III, n° 24 (implicit functions) to integral equations; it can be used to
show the existence, at least locally, of the solutions of almost all reasonable
differential or integral equations. Since all the integrals which now appear
are extended over $[t, +\infty[$, we shall adopt the simplified notation


$$\int \quad \text{in place of} \quad \int_t^{+\infty}$$


up to the end of this n°, clearly not confusing this with an indefinite integral
à la Leibniz. In this notation


$$\int u^{-n-1}\,du = t^{-n}/n$$


for $n \ge 1$ by the FT.

The method of successive approximations consists of starting from the
function $p_1(t)$ in (12), to which $x(t)$ is equal up to the addition of a $O(1/t)$
term, constructing a sequence of functions $x_n(t)$ on $t > 0$ by putting $x_1 = p_1$
and


(10.13)  $$x_{n+1}(t) = p_1(t) - c\int x_n(u)\sin(t - u)u^{-2}\,du,$$

and showing that the $x_n$ converge to a solution of (6).
For $n = 1$


$$|x_2(t) - x_1(t)| \le M|c|\int u^{-2}\,du = M|c/t|$$

where $M = \|p_1\|_{\mathbb{R}} < +\infty$. It follows that


$$|x_3(t) - x_2(t)| = |c| \cdot \left|\int [x_2(u) - x_1(u)]\sin(t - u)u^{-2}\,du\right| \le M|c|^2\int u^{-3}\,du = M|c/t|^2/2.$$

If one has proved that


(10.14)  $$|x_n(t) - x_{n-1}(t)| \le M|c/t|^{n-1}/(n-1)!$$

one finds

$$|x_{n+1}(t) - x_n(t)| = |c| \cdot \left|\int [x_n(u) - x_{n-1}(u)]\sin(t - u)u^{-2}\,du\right| \le \frac{M|c|^n}{(n-1)!}\int u^{-n-1}\,du = M|c/t|^n/n!,$$


which shows in passing that

(10.15)  $$x_{n+1}(t) = x_n(t) + O(t^{-n})$$


at infinity. Since the series $\sum [x_{n+1}(t) - x_n(t)]$ is, by (14), dominated by the
series for $M\exp(|c|/t)$, it converges normally on every interval $t \ge t_0 > 0$, so
that $x_n(t)$ converges to a limit $x(t)$ for every $t > 0$, and does so uniformly
on every interval $[t_0, +\infty[$. One may then pass to the limit under the $\int$ sign
in (13) because of the presence of the integrable² factor $u^{-2}$. It is then clear
that $x(t)$ satisfies (6).

Further, by (14),


$$\begin{aligned}
|x(t) - x_n(t)| &\le \sum_{p \ge 0} |x_{n+p+1}(t) - x_{n+p}(t)| \\
&\le M\sum_{p \ge 0} |c/t|^{n+p}/(n+p)! \le M\sum_{p \ge 0} |c/t|^{n+p}/n!\,p! \\
&= M\exp(|c/t|)\,|c/t|^n/n!,
\end{aligned}$$

a result which implies


(10.16)  $$x(t) = x_n(t) + O(t^{-n})$$


at infinity since the factor $\exp(|c/t|)$ tends to 1.

² Let $I$ be an arbitrary interval, $\mu(x)$ an absolutely integrable function on $I$,
and $(f_n)$ a sequence of bounded functions which converges uniformly on $I$ to
a limit $f$, clearly bounded. The functions $f_n\mu$ and $f\mu$, majorised up to
constant factors by $|\mu|$, are then integrable and one has
$\left|\int [f_n(u) - f(u)]\mu(u)\,du\right| \le \|f_n - f\|_I \cdot \int |\mu(u)|\,du$,
whence $\lim \int f_n(u)\mu(u)\,du = \int f(u)\mu(u)\,du$. Cf. Chap. V,
n° 31, Example 1.

Exercise. Let $I$ be a compact interval, $p$ a continuous function on $I$ and
$K(t,u)$ a continuous function on $I \times I$. Put


$$M = \sup_{t \in I} \int |K(t,u)|\,du.$$


Show that, if $M < 1$, the integral equation

$$x(t) = p(t) + \int K(t,u)x(u)\,du$$


has one and only one solution (the integrals are over $I$). Analogy with a
system of linear equations?


Asymptotics of the solutions: general form 


It is clear that in order to obtain asymptotic evaluations of $x(t)$ one should
seek them for the $x_n(t)$. To simplify the calculations we shall assume that
we are in the case where the “trigonometric binomial” $p_1(t)$ in (10) reduces
to $e^{it}$; the case where it is equal to $e^{-it}$ is treated in the same way (the two
solutions are even complex conjugates if $c \in \mathbb{R}$), and in the general case it is
clear that $x$ is a linear combination of the functions corresponding to these
two particular cases.

Since $x_1(t) = e^{it}$, the relation (13) shows that

(10.17)  $2i\,x_2(t) = 2ie^{it} - 2ic\int e^{iu}\sin(t-u)\,u^{-2}\,du =$
         $= 2ie^{it} - c\int (e^{it} - e^{2iu-it})\,u^{-2}\,du =$
         $= 2ie^{it} - ce^{it}/t + ce^{-it}\int e^{2iu}u^{-2}\,du.$

Here we meet, and we will meet again, an integral of the form $\int e^{au}u^{-p}\,du$
extended over $[t, +\infty[$. By repeatedly integrating by parts it is easy to find a
truncated expansion of arbitrarily high order when $t$ tends to infinity. Generally,
if $\mathrm{Re}(a) \le 0$ to ensure the convergence of the integrals, one has, using
exceptionally the $\int$ sign à la Leibniz,

$\int e^{au}u^{-p}\,du = \frac{e^{au}}{a\,u^p} + \frac{p}{a}\int e^{au}u^{-p-1}\,du =$
$= \frac{e^{au}}{a\,u^p} + \frac{p\,e^{au}}{a^2u^{p+1}} + \frac{p(p+1)}{a^2}\int e^{au}u^{-p-2}\,du$

etc³. When one integrates from $t$ to $+\infty$, the integrated-out parts, zero at
infinity, yield the product of $e^{at}/t^p$ by a polynomial in $1/t$; the integral of
$e^{au}u^{-N-2}$, majorised by that of $u^{-N-2}$, is $O(t^{-N-1})$. One deduces from this
that, for any $N > p$, there is a relation


³ Note that instead of trying to pass from an integral in $u^{-p}$ to an integral in $u^{-p+1}$
as was done in Chap. V, n° 15, Example 2 in the illusory hope of calculating
a primitive explicitly, here one passes from an integral in $u^{-p}$ to integrals in
$u^{-p-1}$, $u^{-p-2}$ etc. whose order of magnitude one may evaluate, even if unable
to calculate them explicitly.

(10.18)  $\int_t^{+\infty} e^{au}u^{-p}\,du = e^{at}\,(?/t^p + ?/t^{p+1} + \dots + ?/t^N) + O(t^{-N-1})$

with coefficients ? which depend on $p$, but not on $N$; the reader may calculate
them: we don’t need them now. Returning to (17), the case where $p = 2$, we
finally find

(10.19)  $2i\,x_2(t) = 2ie^{it} + e^{it}\,(?/t + ?/t^2 + \dots + ?/t^N) + O(t^{-N-1})$

for any $N$.
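As a hedged aside, the coefficients left as ? in (18) can be read off from the integration by parts above. The following worked version is only a sketch: it assumes $\mathrm{Re}(a) \le 0$ and $a \ne 0$, and the summation index $j$ is introduced here for convenience. One integration by parts between $t$ and $+\infty$ gives

$\int_t^{+\infty} e^{au}u^{-p}\,du = -\frac{e^{at}}{a\,t^p} + \frac{p}{a}\int_t^{+\infty} e^{au}u^{-p-1}\,du,$

so that, iterating,

$\int_t^{+\infty} e^{au}u^{-p}\,du = -e^{at}\sum_{j=0}^{N-p} \frac{p(p+1)\cdots(p+j-1)}{a^{j+1}\,t^{p+j}} + O(t^{-N-1}).$

For $p = 2$ and $a = 2i$, the case needed in (17), the first two terms are $e^{2it}\,(i/2t^2 + 1/2t^3)$.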
It is the same for $x_n(t)$ for any $n$. One shows this by induction using (13):

$2i\,x_{n+1}(t) = 2ie^{it} - 2ic\int x_n(u)\sin(t-u)\,u^{-2}\,du =$
$= 2ie^{it} - c\int e^{iu}\,[1 + ?/u + \dots + ?/u^N + O(u^{-N-1})]\;2i\sin(t-u)\,u^{-2}\,du =$
$= 2ie^{it} + c\sum_{0\le p\le N} ?\int (e^{it} - e^{2iu-it})\,u^{-p-2}\,du + \int O(u^{-N-3})\,du.$

The integrals $\int e^{it}u^{-p-2}\,du$ yield the product of $e^{it}$ by a polynomial in
$1/t$ without constant term. The integrals $\int e^{2iu-it}u^{-p-2}\,du$ likewise by (18)
have truncated expansions of arbitrarily high order. Finally, the integral
$\int O(u^{-N-3})\,du$ is $O(t^{-N-2})$. So for every $n$ and every $N$ there is a relation
of the form

(10.20)  $x_n(t) = e^{it}\,(1 + ?/t + ?/t^2 + \dots + ?/t^N) + O(t^{-N-1}).$

In view of (16) one obtains an expansion

(10.21)  $x(t) = e^{it}\,(1 + a_1/t + a_2/t^2 + \dots + a_N/t^N) + O(t^{-N-1})$

for $x(t)$, whose coefficients do not depend on $N$; for if $e^{-it}x(t)$, or any
other function, has truncated expansions of order 12 and 15, the
second, with its terms of degree > 12 removed, yields a truncated expansion
of order 12; now a given function can have only one truncated expansion of
given order, as we saw in n° 3; the two expansions must therefore have the
same terms of degree $\le 12$.

One sometimes expresses this fact by writing (21) in the form of an
asymptotic series

(10.22)  $x(t) \approx e^{it}\sum_{n=0}^{\infty} a_n/t^n;$

this way of writing by no means states that the series on the right hand side
represents the function considered: in almost every case of this kind, including
the one which now occupies us, as we shall see when all has been calculated, the
series is divergent. The form (22) is, by definition, equivalent to the fact that
the relation (21) is valid for any $N$. In other words, the difference between
the left hand side of (22) and the $N$-th partial sum of the second member,
instead of tending to 0 for $t$ given when $N$ increases, tends to 0 for $N$ given
when $t$ increases, and as rapidly as the first term neglected; a nuance not to
be forgotten ...

This is for example what happens on a neighbourhood of $t = 0$ when one
writes the Maclaurin formula for a function $x(t)$ which is not analytic but is
of class $C^\infty$. For any $N$, one has

$x(t) = x(0) + x'(0)\,t + \dots + x^{(N)}(0)\,t^N/N! + O(t^{N+1}),$

in other words

$x(t) \approx \sum x^{(n)}(0)\,t^n/n!,$

but the series has no reason to represent the function if it converges (the
case of $\exp(-1/t^2)$) and even less if it diverges, which is the general case
since the derivatives at the origin can be chosen arbitrarily (Chap. V, n° 29).


Term-by-term differentiation of asymptotic expansions 


The problem now arises of calculating the coefficients $a_n$ in the expansion
(22) explicitly, preferably without drowning oneself in calculation. In doing
this in a more explicit way than we have done above we could find the
recurrence relations allowing us to calculate the $a_n$. A more elegant⁴ and, above
all, more instructive method consists of showing that on differentiating (22)
term-by-term, one obtains the analogous asymptotic expansions for $x'$ and
$x''$; on substituting into the differential equation (1) one will find the needed
recurrence relations immediately.

It is not at all obvious, and it is generally false, that one can deduce
an asymptotic series for the derivative from the asymptotic series of a given
function by differentiating term-by-term. The derivative of a function $O(t^r)$
at infinity ($r \in \mathbb{R}$) has no reason to be $O(t^{r-1})$: the function $x(t) = \sin(t^2)/t$
is $O(1/t)$ at infinity, but its derivative $x'(t) = 2\cos(t^2) - \sin(t^2)/t^2$ is $O(1)$
and not $O(1/t^2)$ at infinity. This is the problem we have already met in
connection with differentiating term-by-term the sum of a series of
differentiable functions: one may, thanks to FT, majorise a function starting from a
majoration of its derivative, but the inverse operation is impossible.


⁴ One of the Goncourt brothers, famous literary critics of the XIXth century, relates
in his Journal that during the reception of a new immortal, X, into the Académie
française, the academician Y charged with delivering the eulogy on X had the
regrettable idea of describing the oratorical style of X as elegant. The latter,
furious, stood up and replied: Elegant yourself, Sir! (I quote from memory).

The reality is that here, as in the case of a convergent series or sequence,
it is the existence of an asymptotic series for the derivative which enables
us to obtain one for the function itself. For assume that the derivative of a
function $f(t)$ has an expansion

(10.23)  $f'(t) = a_0 + a_1/t + \dots + a_{N+1}/t^{N+1} + O(1/t^{N+2})$

at infinity. One cannot integrate it from $t$ to $+\infty$ because of the first two
terms, but one reduces to the case where they are zero on replacing $f(t)$ by
$g(t) = f(t) - a_0t - a_1\log t$. Then $g'(t) = O(t^{-2})$, the derivative is integrable
from $t$ to $+\infty$, the function $g$ tends to a finite limit $g(+\infty)$ when $t \to +\infty$
and, by the FT extended to the interval $[t, +\infty[$ (by passage to the limit),

$g(+\infty) - g(t) = \int g'(u)\,du = a_2/t + a_3/2t^2 + \dots + a_{N+1}/Nt^N + O(1/t^{N+1})$

since the integral from $t$ to $+\infty$ of an $O(u^{-k})$ function is majorised up to a
constant factor by that of $u^{-k}$. Returning to the original function $f(t)$, the
relation (23) implies

(10.24)  $f(t) = a_0t + a_1\log t + b - a_2/t - a_3/2t^2 - \dots - a_{N+1}/Nt^N + O(1/t^{N+1})$

for any $N$, with an inevitable constant $b$, since knowing $f'$ determines $f$
only up to a constant. It is then clear, on comparing (23) and (24), that the
asymptotic expansion of $f'$ is obtained by differentiating that of $f$
term-by-term.

Returning to the function $x(t)$ which concerns us, we must show directly
that $x'(t)$ and $x''(t)$ have asymptotic expansions analogous to (22). To do
this, let us again consider the integral equation (10.12)

$x(t) = e^{it} - c\int_t^{+\infty} x(u)\sin(t-u)\,u^{-2}\,du$

and apply to it the formula of differentiation under the $\int$ sign with variable
limits in the case of an infinite interval, namely

$\frac{d}{dt}\int_{\varphi(t)}^{+\infty} f(t,u)\,du = \int_{\varphi(t)}^{+\infty} D_1f(t,u)\,du - f(t,\varphi(t))\,\varphi'(t)$

(Chap. V, n° 12, Theorem 13, which extends immediately to the case of an
infinite interval using⁵ n° 25, Theorem 24, of the same Chap. V). The latter
assumes that, when $t$ remains in a compact set, the function $D_1f(t,u)$ is
majorised by a fixed integrable function of $u$. Now, in the case of (12),

$|D_1f(t,u)| = |x(u)\cos(t-u)\,u^{-2}| \le Mu^{-2}$

since the function $x(u)$ is bounded on every interval $t \ge t_0 > 0$; there is
no problem. The function $f(t,u)$ which we are integrating here from $t$ to
$+\infty$ vanishes for $u = \varphi(t)$, the wholly integrated part of the differentiation
formula disappears, and there remains

(10.25)  $x'(t) = ie^{it} - c\int x(u)\cos(t-u)\,u^{-2}\,du$

where one integrates from $t$ to $+\infty$; compare to (7) and (8).

⁵ One writes that the integral from $\varphi(t)$ to $+\infty$ is the difference between the integrals
from $a$ to $+\infty$ and from $a$ to $\varphi(t)$ for a fixed $a$.
Exercise. By differentiating the recurrence relation between the $x_n(t)$
show that

$x''_{n+1}(t) + x_{n+1}(t) = c\,x_n(t)\,t^{-2}$

and that, for every $r$, the derivatives $x_n^{(r)}(t)$ of the $x_n$ converge to $x^{(r)}(t)$
uniformly on $t \ge t_0 > 0$.

On substituting (21) in (25), one has

$x'(t) = ie^{it} - c\int [1 + a_1/u + a_2/u^2 + \dots + a_N/u^N + O(u^{-N-1})]\,e^{iu}\cos(t-u)\,u^{-2}\,du$
$= ie^{it} - c\int (1 + a_1/u + a_2/u^2 + \dots + a_N/u^N)\,e^{iu}\cos(t-u)\,u^{-2}\,du + O(t^{-N-2}).$

Arguing as above (replacing the sine by a cosine clearly changes nothing in the
method), one obtains an asymptotic series

(10.26)  $x'(t) \approx e^{it}\sum b_n/t^n$

similar to (22). As for $x''(t) = -(1 - c/t^2)\,x(t)$, one obtains an asymptotic
series for it directly starting from that for $x(t)$.


Coefficients of the asymptotic expansion 


Let us now put $e^{-it}x(t) = y(t)$. We have $y(t) \approx \sum a_nt^{-n}$ by (22), and on
the other hand we know that the derivatives

(10.27)  $y'(t) = e^{-it}[x'(t) - ix(t)], \qquad y''(t) = e^{-it}[x''(t) - 2ix'(t) - x(t)]$

also have asymptotic expansions of the same type. They too can be derived
from the expansion of $y(t)$ by differentiating the latter term-by-term as for
a power series in $1/t$. [This means that the expansion of $x'(t)$ too can be
derived from that of $x(t)$ by differentiating term-by-term, not forgetting to
differentiate the factor $e^{it}$.] The expansions of $y'$ and $y''$ must thus be

$y'(t) \approx \sum -na_nt^{-n-1}, \qquad y''(t) \approx \sum n(n+1)\,a_nt^{-n-2}.$

Now let us exploit the differential equation we started from. Since we put
$x(t) = e^{it}y(t)$ we have $x''(t) + x(t) = e^{it}[y''(t) + 2iy'(t)]$, whence
$ct^{-2}y = y'' + 2iy'$. Thus

$c\,(a_0t^{-2} + a_1t^{-3} + a_2t^{-4} + \dots) \approx (2\cdot1\,a_1t^{-3} + 3\cdot2\,a_2t^{-4} + \dots) - 2i\,(a_1t^{-2} + 2a_2t^{-3} + 3a_3t^{-4} + \dots).$

Because the asymptotic series of a given function is unique it is legitimate to
calculate as for a formal series. Since $a_0 = 1$ we find

$-2ia_1 = c, \qquad -2ia_2 = (c - 1\cdot2)\,a_1/2, \qquad -2ia_3 = (c - 2\cdot3)\,a_2/3$

and generally

$a_n/a_{n-1} = i[c - n(n-1)]/2n;$

one deduces $a_n$ by multiplying together the first $n$ relations.

Note that the ratio $|a_{n+1}/a_n|$ tends to $+\infty$. The radius of convergence
of the power series $\sum a_nz^n$ is therefore zero, which confirms that the
expansion as an asymptotic series $x(t) \approx e^{it}\sum a_nt^{-n}$ is the exact opposite of an
expansion as a convergent series.
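The recurrence $a_n/a_{n-1} = i[c - n(n-1)]/2n$ is easy to run numerically. The following is a minimal sketch, not part of the original argument; the function name and the sample value $c = 1/4$ are assumptions chosen purely for illustration. It builds the first coefficients and the successive ratios $|a_{n+1}/a_n|$, whose growth is what forces the radius of convergence to be zero.

```python
def asymptotic_coefficients(c, count):
    """Return [a_0, ..., a_count] from a_0 = 1 and
    a_n / a_{n-1} = i * (c - n(n-1)) / (2n)."""
    coeffs = [1 + 0j]
    for n in range(1, count + 1):
        ratio = 1j * (c - n * (n - 1)) / (2 * n)
        coeffs.append(coeffs[-1] * ratio)
    return coeffs


# c = 1/4 is an arbitrary illustrative value, not one singled out by the text.
coeffs = asymptotic_coefficients(0.25, 30)
# Successive ratios |a_{n+1} / a_n|; they grow roughly like n/2.
ratios = [abs(coeffs[n + 1] / coeffs[n]) for n in range(1, 30)]
```

If $c$ happens to be of the form $n(n-1)$ for some integer $n \ge 1$, the recurrence gives $a_n = 0$ and the series terminates; for every other $c$ the ratios increase without bound.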

Exercise. Show that the differential equation (1) is satisfied by convergent
series of the form $t^{\alpha}\sum_{n\ge0} a_nt^n$, with a non-integer exponent $\alpha$ and coefficients
to be determined.

The method used here in the case of the Bessel equation has given rise 
to an ocean of literature concerning either other special functions, or general 
linear differential equations; Chapters XIV and XV of Dieudonné, Calcul 
infinitésimal, give a faint glimpse of the general case and of the theory of the 
Bessel functions, about which voluminous treatises have been written. 

The best classical reference on these and the other “special functions” 
is the “Bateman Project”, Higher Transcendental Functions (McGraw Hill, 
1953-1955, 3 vols). To understand the subject, it would be better to read 
N. Vilenkin, Special functions and the theory of group representations (Amer- 
ican Mathematical Society, 1968) (translated from original Russian edition 
(Moscow, 1965)), which is based on ideas which are totally foreign to the 
“experts” on the classical theory and of much more general scope than these 
(harmonic analysis on non commutative Lie groups). Since they are well 
above the level of this book, there is no point in citing more recent and 
inaccessible references. 




§ 2. Summation formulae 


11 — Cavalieri and the sums $1^k + 2^k + \dots + n^k$


In Chapter II, n° 11 we gave a very effective direct method for integrating the
function $x^k$ when $k$ is a positive integer: one divides the interval of
integration $[a,b]$ by points $aq^p$ forming a geometric progression, with $q = (b/a)^{1/n}$,
and lets $n$ tend to infinity. But the first mathematicians to perform this
calculation proceeded in another way: like Archimedes in the case $k = 2$, they
used a subdivision of $[a,b]$ by the points of an arithmetic progression. In the
simple case where the interval of integration is of the form $[0,a]$ one puts
$q = a/n$ and uses the points $q, 2q, \dots, nq$; the integral sought is then clearly
the limit of the sums

(11.1)  $\sigma_k = [q^k + (2q)^k + \dots + (nq)^k]\,a/n = (1^k + 2^k + \dots + n^k)\,a^{k+1}/n^{k+1}.$

For $k = 1$ it had been known for a long time that

(11.2)  $1 + 2 + \dots + n = n(n+1)/2 = n^2/2 + n/2,$

whence $\sigma_1 = a^2n(n+1)/2n^2$, an expression which tends to $a^2/2$. For $k = 2$,
the case treated by Archimedes, who already knew that

(11.3)  $1^2 + \dots + n^2 = n(n+1)(2n+1)/6 = n^3/3 + n^2/2 + n/6,$

one has $\sigma_2 = a^3\,(1/3 + 1/2n + 1/6n^2)$, which tends to $a^3/3$.
The Italian Cavalieri studied the case where $k = 3$ around 1630, using
the formula

(11.4)  $1^3 + \dots + n^3 = n^2(n+1)^2/4 = n^4/4 + n^3/2 + n^2/4,$

which gave him the value $a^4/4$ for the integral. Around 1646 he extended the
calculations up to $k = 9$, with the help of the formula

(11.5)  $1^9 + \dots + n^9 = n^{10}/10 + n^9/2 + 3n^8/4 - 7n^6/10 + n^4/2 - 3n^2/20.$

These calculations are all the more praiseworthy in that the modern machinery
of algebra, with its condensed notation, was then still very much in flux. John Wallis
set out the method in his Arithmetica Infinitorum of 1656, but no one was
yet able to find the formula analogous to (5) for an arbitrary value of the
exponent $k$.
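The formulae (11.1) to (11.5) can be checked mechanically in exact arithmetic. The sketch below is an illustration added here, not Cavalieri's procedure; the function names are hypothetical. It evaluates the Riemann sum $\sigma_k$ of (11.1) and compares the closed form (11.5) with the sum computed directly.

```python
from fractions import Fraction


def power_sum(k, n):
    """Direct evaluation of 1^k + 2^k + ... + n^k."""
    return sum(p ** k for p in range(1, n + 1))


def riemann_sum(k, a, n):
    """sigma_k of (11.1): (1^k + ... + n^k) * a^(k+1) / n^(k+1), exactly."""
    return Fraction(power_sum(k, n)) * Fraction(a) ** (k + 1) / n ** (k + 1)


def cavalieri_ninth(n):
    """Right-hand side of (11.5)."""
    n = Fraction(n)
    return (n**10 / 10 + n**9 / 2 + 3 * n**8 / 4
            - 7 * n**6 / 10 + n**4 / 2 - 3 * n**2 / 20)
```

For instance `riemann_sum(2, 1, n)` equals $1/3 + 1/2n + 1/6n^2$ exactly, which makes the limit $a^3/3$ (here with $a = 1$) visible at a glance.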

Fermat, who did not publish, took up the problem around 1636 (it was
his idea to use a geometric progression), but instead of trying to calculate
$1^k + \dots + n^k$ exactly he was content to find an approximate value adequate
to solve the problem, namely

(11.6)  $n^{k+1}/(k+1) < 1^k + \dots + n^k < (n+1)^{k+1}/(k+1).$

This shows that up to a factor $a^{k+1}$ the Riemann sum (1) lies between the
products of $1/(k+1)$ by 1 and $(n+1)^{k+1}/n^{k+1}$, which tend to 1.
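A quick numerical sanity check of Fermat's double inequality (11.6) is sketched below; it proves nothing, of course, and the function name is an assumption made for this illustration. Multiplying through by $k + 1$ keeps everything in integer arithmetic.

```python
def fermat_bounds_hold(k, n):
    """Check n^(k+1) < (k+1) * (1^k + ... + n^k) < (n+1)^(k+1)."""
    s = sum(p ** k for p in range(1, n + 1))
    return n ** (k + 1) < (k + 1) * s < (n + 1) ** (k + 1)


# Exhaustive check on a small grid of exponents k >= 1 and integers n >= 1.
all_ok = all(fermat_bounds_hold(k, n) for k in range(1, 9) for n in range(1, 41))
```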
(6) is proved by induction on n > 2. The case n = 2 is obvious. If (6) has 
been proved for an integer n it follows that 


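$\frac{n^{k+1}}{k+1} + (n+1)^k < 1^k + \dots + n^k + (n+1)^k < \frac{(n+1)^{k+1}}{k+1} + (n+1)^k.$

It is therefore enough to show that

$\frac{n^{k+1}}{k+1} + n^k < \frac{(n+1)^{k+1}}{k+1} < \frac{n^{k+1}}{k+1} + (n+1)^k$

(the left-hand inequality being applied with $n + 1$ in place of $n$),
and then, putting $x = 1/n$, that

$1 + (k+1)\,x < (1+x)^{k+1} < 1 + (k+1)\,x\,(1+x)^k.$

Since $x > 0$ the binomial formula proves the first inequality. The second can
be written as

$1 + \sum_{p=1}^{k+1}\binom{k+1}{p}x^p < 1 + (k+1)\sum_{p=0}^{k}\binom{k}{p}x^{p+1}$

and reduces to the inequality

$\binom{k+1}{p} = \frac{k+1}{p}\binom{k}{p-1} \le (k+1)\binom{k}{p-1}$

between binomial coefficients.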
Exercise⁶. (a) Prove the equalities

$S_n^1 := 1 + 2 + \dots + n = \tfrac12 n(n+1),$
$S_n^2 := 1^2 + 2^2 + \dots + n^2 = n(n+1)(2n+1)/6,$
$S_n^3 := 1^3 + 2^3 + \dots + n^3 = [\tfrac12 n(n+1)]^2 = (1 + 2 + \dots + n)^2.$

(b) For

$S_n^p := 1^p + 2^p + \dots + n^p,$

establish the identity

$(p+1)\,S_n^p + \binom{p+1}{2}S_n^{p-1} + \dots + S_n^0 = (n+1)^{p+1} - 1$

discovered by Pascal in 1654.

⁶ Walter, Analysis I, p. 36. The signs := mean that the expression which follows
the sign = is the definition of that which precedes the sign :.

(c) Show that for every $p \ge 1$ there exist $p$ real numbers $c_1, \dots, c_p$ such
that
SP =nPt  /(pt+1) +n? /2t+eanP 1 +...+cp-int ep. 


Hints:

$(x+1)^{p+1} - x^{p+1} = \binom{p+1}{1}x^p + \binom{p+1}{2}x^{p-1} + \dots + 1 \qquad (p, n \in \mathbb{N},\ n \ge 1).$

Add these equations term-by-term for $x = 1, 2, \dots, n$. The assertions (a) and
(c) can be proved by induction or from Pascal’s identity.
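Pascal's identity in part (b) is also an effective recursion: knowing $S_n^0, \dots, S_n^{p-1}$ one solves for $S_n^p$. The sketch below illustrates this under the conventions of the exercise; the function name is hypothetical and the code is offered as a check, not as the intended solution.

```python
from math import comb


def power_sums_via_pascal(max_p, n):
    """Return [S_n^0, ..., S_n^max_p] computed from Pascal's identity
    sum_{j=0}^{p} C(p+1, j) * S_n^j = (n+1)^(p+1) - 1."""
    sums = [n]  # S_n^0 = 1 + 1 + ... + 1 = n
    for p in range(1, max_p + 1):
        # Everything except the leading term (p+1) * S_n^p is already known.
        rest = sum(comb(p + 1, j) * sums[j] for j in range(p))
        sums.append(((n + 1) ** (p + 1) - 1 - rest) // (p + 1))
    return sums
```

The integer division is exact, since the identity guarantees that the numerator is a multiple of $p + 1$.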


12 — Jakob Bernoulli 


In 1713, in his Ars Conjectandi, the most famous, if not the first, of the
treatises on the calculus of probabilities, Jakob Bernoulli published (or rather,
it was published for him, for he died in 1705 before finishing his book) the
general method which allows one, for $k \in \mathbb{N}$, to express $1^k + \dots + n^k$ as a
polynomial of degree $k + 1$ in $n$. He calculated the first sums afresh and noted
in passing that he had been able to calculate “in less than half a quarter hour”
that

$1^{10} + \dots + 1000^{10} = 91\,409\,924\,241\,424\,243\,424\,241\,924\,242\,500.$
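Bernoulli's boast can be verified in a fraction of a second today. The sketch below, added as an illustration, compares the brute-force sum with the closed form obtained from Bernoulli's general formula (12.3) for $k = 10$; the function name is an assumption.

```python
from fractions import Fraction


def tenth_power_sum_closed_form(n):
    """Bernoulli's formula for k = 10:
    n^11/11 + n^10/2 + 5n^9/6 - n^7 + n^5 - n^3/2 + 5n/66."""
    n = Fraction(n)
    return (n**11 / 11 + n**10 / 2 + 5 * n**9 / 6
            - n**7 + n**5 - n**3 / 2 + 5 * n / 66)


direct = sum(p ** 10 for p in range(1, 1001))
closed = tenth_power_sum_closed_form(1000)
```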


Exercise. If the human species had had thirteen fingers instead of ten,
Bernoulli would have had to calculate the sum $1^{13} + \dots + 2197^{13}$. Find the
result in less than half an hour using numeration to base 13.

His general method⁷ was to start from the relation

(12.1)  $\binom{n}{k} = \sum_{p=1}^{n}\binom{p-1}{k-1}$

between the binomial coefficients (which he wrote explicitly, as everyone did
then); one may prove this easily by induction on $n$, writing that

$\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}.$

For $k = 3$, one thus finds

(12.2)  $n(n-1)(n-2)/3! = \sum_{p=1}^{n} (p-1)(p-2)/2! = \sum_{p=1}^{n} (p^2/2 - 3p/2 + 1),$

which, using the formulae for the exponents $k = 0$ and 1, yields the formula
for $k = 2$. The formula (1) for $k = 4$ then allows one to calculate the $\sum p^3$
from the formulae already obtained, and so on.

⁷ See Vol. III of Moritz Cantor, pp. 343-347.

But Bernoulli went much further. He stated that

(12.3)  $\sum_{p=1}^{n} p^k = \frac{n^{k+1}}{k+1} + \frac{n^k}{2} + \frac{k}{2}\,An^{k-1} + \frac{k(k-1)(k-2)}{2\cdot3\cdot4}\,Bn^{k-3} + \frac{k(k-1)(k-2)(k-3)(k-4)}{6!}\,Cn^{k-5} + \dots$

with coefficients $A$, $B$, $C$, ... independent of $k$, and exponents $k - 1$, $k - 3$,
$k - 5$, ... The first two terms were obvious because he knew the explicit
formulae for $k \le 10$, but no one knows how he divined the relation (3) from
them, which he merely stated after a list of explicit formulae. Of course, if
one accepts (3), the first formulae easily give the values

$A = 1/6, \quad B = -1/30, \quad C = 1/42, \quad D = -1/30, \quad E = 5/66,$ etc.

A much less magical method is to postulate that, in conformity with the
first formulae, one has⁸

(12.4)  $1^k + \dots + n^k = A_{k+1}(n)$

with $A_1(x) = x$ for $k = 0$ and, for $k \ge 1$, a polynomial of degree $k + 1$
without constant term, then to establish those properties of these conjectured
polynomials which allow one to calculate them “without calculations” and,
to finish, to verify that they satisfy (4).

To start with, the relation

$n^k = (1^k + \dots + n^k) - (1^k + \dots + (n-1)^k)$

implies the polynomial identity

(12.5)  $A_{k+1}(x) - A_{k+1}(x-1) = x^k$

since the difference of the two sides is a polynomial which vanishes at every
$x \in \mathbb{N}$. This relation already determines the $A_k$ up to additive constants, for
the difference between two solutions is a polynomial of period 1, so constant;
if one then assumes $A_k(0) = 0$ for $k \ge 1$ then the $A_k$ are entirely determined
by (5). Now on differentiating (5) one sees that $A'_{k+1}(x)/k$ satisfies it for $k - 1$. If
one writes $a_k$ for the constant term of $A'_{k+1}(x)/k$, then

(12.6)  $A'_2(x) = x + a_1, \qquad A'_{k+1}(x)/k = A_k(x) + a_k$ for $k \ge 2$.

⁸ Bernoulli uses the notation $S\,n^k = 1^k + \dots + n^k$, which, once again, violates all
the tabus concerning phantom and free variables. See Hairer and Wanner, p. 15,
for a photographic reproduction of Bernoulli’s table of the first ten formulae.


The $A_k$ having zero constant term, one obtains step-by-step, by
straightforward calculation of the successive primitives,

$A_1(x) = x,$
$A_2(x) = x^2/2 + a_1x,$
$A_3(x) = x^3/3 + a_1x^2 + 2a_2x,$
$A_4(x) = x^4/4 + a_1x^3 + 3a_2x^2 + 3a_3x,$
$A_5(x) = x^5/5 + a_1x^4 + 4a_2x^3 + 6a_3x^2 + 4a_4x;$

the general formula

(12.7)  $A_k(x) = x^k/k + \sum_{p=1}^{k-1}\binom{k-1}{p-1}a_p\,x^{k-p},$

now obvious, is just (3) with

$a_1 = 1/2, \quad a_2 = A/2, \quad a_3 = 0, \quad a_4 = B/4, \quad a_5 = 0, \quad a_6 = C/6,$

etc.

Though not immediately providing the numerical values of the $a_p$, at
least (6) proves the existence of a relation (7) with the same coefficients $a_p$
for all the formulae. This, which Moritz Cantor calls Jakob Bernoulli’s “idea of
genius”, seems relatively humdrum to me, even for the period; for if there
was anything they knew how to do, it was to calculate the derivatives or
primitives of polynomials in $x$ ...

To obtain the numerical values of the coefficients one uses a remark which,
here again, was surely within the scope of the brilliant inventor: by (5) one must
have

(12.8)  $A_k(-1) = A_k(0)$ for $k \ge 2$

and so $A_k(-1) = 0$. Whence, by (7), a relation⁹

(12.9)  $\frac1k - a_1 + \binom{k-1}{1}a_2 - \binom{k-1}{2}a_3 + \dots + (-1)^{k-1}\binom{k-1}{k-2}a_{k-1} = 0 \qquad (k \ge 2)$

⁹ Bernoulli clearly knew this, for he wrote, without proof, that to calculate the
coefficients in his first ten formulae one uses the fact that $A_k(1) = 1$; he details
the calculation for Ag. See the text in Walter, Analysis I, pp. 162-163. 
which enables one to calculate the coefficients step-by-step.
It remains to show that with this choice of the $a_p$ the $A_k$ do indeed satisfy
(4). Since

(12.10)  $A_{k+1}(n) = [A_{k+1}(n) - A_{k+1}(n-1)] + [A_{k+1}(n-1) - A_{k+1}(n-2)] + \dots + [A_{k+1}(1) - A_{k+1}(0)] + A_{k+1}(0)$

and since $A_{k+1}(0) = 0$, (4) will in fact be a consequence of (5), clearly true
for $k = 0$. We shall prove (5) by induction on $k$.

First, by the simplest of the relations between binomial coefficients,
(7) defines polynomials satisfying (6) for any $a_p$. If one has already
verified that $A_k(x) - A_k(x-1) = x^{k-1}$ then the formula (6) shows that
$A'_{k+1}(x) - A'_{k+1}(x-1) = kx^{k-1}$, whence $A_{k+1}(x) - A_{k+1}(x-1) = x^k$ up to
an additive constant. This must be zero for $k = 0$ since $A_1(x) = x$. It is zero
for $k \ge 1$ because the choice (9) of the $a_p$ is equivalent to $A_k(0) = A_k(-1)$
and shows that (5) is valid for $x = 0$. Hence (5), and consequently (4) for
any $k$.


Posterity has preferred, for reasons which will appear later, to use the
polynomials $B_k(x)$, $k \ge 0$, of degree $k$, possibly with nonzero constant terms

(12.11)  $B_k(0) = b_k,$

and chosen so as to replace (6) by

(12.12)  $B'_k(x) = kB_{k-1}(x), \qquad k \ge 1,$

and (8) by

(12.13)  $B_k(1) = B_k(0), \qquad k \ge 2.$

One chooses

$B_0(x) = 1$

to simplify the formulae as much as possible. Again calculating
straightforwardly one obtains

$B_1(x) = b_0x + b_1,$
$B_2(x) = b_0x^2 + 2b_1x + b_2,$
$B_3(x) = b_0x^3 + 3b_1x^2 + 3b_2x + b_3,$
$B_4(x) = b_0x^4 + 4b_1x^3 + 6b_2x^2 + 4b_3x + b_4,$
$B_5(x) = b_0x^5 + 5b_1x^4 + 10b_2x^3 + 10b_3x^2 + 5b_4x + b_5$

and more generally

(12.14)  $B_k(x) = \sum_{p=0}^{k}\binom{k}{p}b_p\,x^{k-p},$

a formula which, here again, implies (12) for any choice of the $b_p$. It depends
only on (12) and is not sufficient to determine the $b_p$; but (13) can be written

(12.15)  $\binom{k}{0}b_0 + \binom{k}{1}b_1 + \dots + \binom{k}{k-1}b_{k-1} = 0, \qquad k \ge 2,$

i.e.

$1 + 2b_1 = 0,$
$1 + 3b_1 + 3b_2 = 0,$
$1 + 4b_1 + 6b_2 + 4b_3 = 0,$

etc., which allows one to calculate the Bernoulli numbers $b_p$ afresh,
step-by-step. Euler, who discovered them in another way as we shall see, and must
certainly have sought an explicit “formula” for the solution, had come to
the conclusion that there probably was none; posterity has confirmed this,
and has even quasi-proved it, by observing that the $b_p$ grow far too fast
to be expressible by algebraic, exponential and other
functions. You will find a little later their values for $p \le 30$, calculated by
Euler; it seems quite implausible, considering the taste of the Bernoullis for
numerical calculations, that Jakob had not pushed the calculations beyond
$b_{10} = 5/66$, but he did not publish them.
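The step-by-step calculation just described is a one-line recursion once written with exact fractions. The following is a minimal sketch of it; the function name is an assumption, and the code is only meant to illustrate the recurrence $\binom{k}{0}b_0 + \dots + \binom{k}{k-1}b_{k-1} = 0$ with $b_0 = 1$.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """Return [b_0, ..., b_count], using b_0 = 1 and, for k >= 2,
    sum_{p=0}^{k-1} C(k, p) * b_p = 0."""
    b = [Fraction(1)]
    for k in range(2, count + 2):
        # The relation for k determines b_{k-1}, whose coefficient is C(k, k-1) = k.
        partial = sum(comb(k, p) * b[p] for p in range(k - 1))
        b.append(-partial / k)
    return b
```

Calling `bernoulli_numbers(10)` reproduces the values $A, B, C, D, E$ of (12.3) as $b_2, b_4, \dots, b_{10}$, with the odd-indexed $b_3, b_5, \dots$ equal to zero.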
Let us now show that instead of (5) one has

(12.16)  $B_k(x+1) - B_k(x) = kx^{k-1}.$

This is clear for $k = 0$. If (16) holds for $k - 1$ the relation (12) shows that it is
true for $k$ up to an additive constant. But by (13) it is correct without an additive
constant for $x = 0$. Whence (16).

Finally, (16) shows that

$B_k(n+1) = [B_k(n+1) - B_k(n)] + [B_k(n) - B_k(n-1)] + \dots + [B_k(2) - B_k(1)] + B_k(1) = k\,(n^{k-1} + \dots + 1^{k-1}) + b_k,$

whence

(12.17)  $1^{k-1} + \dots + n^{k-1} = [B_k(n+1) - b_k]/k.$
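Formula (12.17) can be tested directly from the definition (12.14) of the Bernoulli polynomials. The sketch below is an illustration with hypothetical function names; it restricts itself to $k \ge 2$, where $B_k(1) = b_k$ holds by (13).

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """[b_0, ..., b_count] from b_0 = 1 and sum_{p<k} C(k, p) b_p = 0 (k >= 2)."""
    b = [Fraction(1)]
    for k in range(2, count + 2):
        b.append(-sum(comb(k, p) * b[p] for p in range(k - 1)) / k)
    return b


def bernoulli_poly(k, x, b):
    """B_k(x) = sum_{p=0}^{k} C(k, p) * b_p * x^(k-p), as in (12.14)."""
    return sum(comb(k, p) * b[p] * Fraction(x) ** (k - p) for p in range(k + 1))


def power_sum_via_bernoulli(k, n, b):
    """(12.17): 1^(k-1) + ... + n^(k-1) = [B_k(n+1) - b_k] / k, for k >= 2."""
    return (bernoulli_poly(k, n + 1, b) - b[k]) / k
```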


A comparison with (4) shows that

$B_k(x+1) = kA_k(x) + b_k$

or, by (16),

(12.18)  $B_k(x) = kA_k(x) - kx^{k-1} + b_k = x^k - kx^{k-1} + k\sum_{p=1}^{k-1}\binom{k-1}{p-1}a_p\,x^{k-p} + b_k.$

(14) then shows that $kb_1 = ka_1 - k$, whence

$a_1 = b_1 + 1 = 1/2$

and, for $p \ge 2$,

$\binom{k}{p}b_p = k\binom{k-1}{p-1}a_p = p\binom{k}{p}a_p,$

whence

$a_p = b_p/p.$

In view of Bernoulli’s relations between the $a_p$ and the coefficients $A$, $B$, $C$,
etc. we see that they are just $b_2$, $b_4$, $b_6$, etc.
We still have to show that, according to the first formulae,

(12.19)  $b_3 = b_5 = \dots = 0.$

Since $b_k = B_k(0) = B_k(1)$ this will follow from the relation

(12.20)  $B_k(1-x) = (-1)^kB_k(x).$

To establish this one puts $C_k(x) = (-1)^kB_k(1-x)$ and confirms by a
one-line calculation that the $C_k$ satisfy the conditions (12) and (13) as well as
$C_0(x) = 1$. Now these conditions determine the $B_k$ fully.

Here, to conclude, are the values of the Bernoulli numbers as calculated
by Euler:

$b_0 = 1, \quad b_1 = -1/2, \quad b_2 = 1/6, \quad b_4 = -1/30, \quad b_6 = 1/42,$
$b_8 = -1/30, \quad b_{10} = 5/66, \quad b_{12} = -691/2730, \quad b_{14} = 7/6,$
$b_{16} = -3617/510, \quad b_{18} = 43867/798, \quad b_{20} = -174611/330,$
$b_{22} = 854513/138, \quad b_{24} = -236364091/2730, \quad b_{26} = 8553103/6,$
$b_{28} = -23749461029/870, \quad b_{30} = 8615841276005/14322,$
$b_{32} = -7709321041217/510, \quad b_{34} = 2577687858367/6.$


13 — The power series for cot z 


By the definition of the binomial coefficients the recurrence relation

$\sum_{p=0}^{k-1}\binom{k}{p}b_p = 0 \qquad (k \ge 2)$

can be rewritten as

$\sum_{p=0}^{k-1}\frac{b_p}{p!\,(k-p)!} = 0 \qquad (k \ge 2)$

and, in this form, evokes the formula for the multiplication of formal power
series; to be precise, it is equivalent to the identity

(13.1)  $\sum_{p\ge0} b_pX^p/p!\ \cdot\ \sum_{q\ge0} X^q/(q+1)! = 1,$

or, multiplying by $X$, to

$[\exp(X) - 1]\cdot\sum b_pX^{[p]} = X,$

where, we recall, we have put $X^{[p]} = X^p/p!$. We do not know the radius of
convergence of the series $\sum b_pz^{[p]}$ a priori, but we know that the power series

(13.2)  $z^{-1}(e^z - 1) = 1 + z/2! + z^2/3! + \dots$

converges for any $z$. By the general theorems on analytic functions (Chap. II,
n° 22, particular case of Theorem 17), we know that the reciprocal of the
function (2) admits an expansion in a power series on a neighbourhood of
$z = 0$; by (1), this series must be $\sum b_pz^{[p]}$. This shows on the one hand that
the radius of convergence $R$ of this series is not zero (a nonobvious result
since for the moment we do not know the order of magnitude of the $b_p$) and
on the other hand that

(13.3)  $z/(e^z - 1) = \sum b_nz^{[n]} = 1 - z/2 + z^2/12 - z^4/720 + \dots$

for $|z|$ small enough. In fact, and as we shall see with the help of general
theorems on analytic functions, the relation (3) is valid in the largest disc of
centre 0 where the left hand side is analytic or holomorphic, i.e. where $e^z - 1$
does not vanish, whence

$R = 2\pi,$

a result which we shall find again a little later, without recourse to Cauchy
or Weierstrass.
In the formula (1), let us replace the constants $b_p$ by the Bernoulli
polynomials

$B_p(t) = \sum_{m+n=p}\binom{p}{m}b_mt^n = p!\sum_{m+n=p} b_mt^n/m!\,n!;$

it follows that

$\sum B_p(t)\,X^p/p! = \sum_{m,n} b_mX^{[m]}(tX)^{[n]} = \exp(tX)\sum b_mX^{[m]},$

whence, in view of (3),

(13.4)  $\sum B_p(t)\,z^{[p]} = ze^{tz}/(e^z - 1),$

a relation valid, here again and for the same reasons as above, for $|z| < 2\pi$;
$t$, on the other hand, can be an arbitrary complex number.

From this one may deduce the power series expansions of the functions
$\coth z$ and $\cot z$. For the first, observe that, by (3),

$z\coth z = z\,\frac{e^z + e^{-z}}{e^z - e^{-z}} = z\,\frac{e^{2z} + 1}{e^{2z} - 1} = z + \frac{2z}{e^{2z} - 1} = z + \sum b_n(2z)^n/n!$

whence

(13.5)  $z\coth z = 1 + z^2/3 - z^4/45 + 2z^6/945 - z^8/4725 + 2z^{10}/93555 - \dots.$

For $z\cot z = iz\coth iz$ one then obtains

(13.6)  $z\cot z = 1 - z^2/3 - z^4/45 - 2z^6/945 - z^8/4725 - \dots = 1 - \sum_{p\ge1} |b_{2p}|\,(2z)^{2p}/(2p)!.$

Now we saw at the end of n° 22 of Chap. II that if one puts

$\cot x = 1/x - c_1x - c_3x^3 - \dots,$

then

$\pi^{2p}c_{2p-1} = 2\sum 1/n^{2p} = 2\zeta(2p).$

Comparing with (6), we see that $c_{2p-1} = |b_{2p}|\,2^{2p}/(2p)!$, whence

(13.7)  $|b_{2p}| = \frac{2\,(2p)!}{(2\pi)^{2p}}\,\zeta(2p)$

which reduces the calculation of the sums $\sum 1/n^{2p}$ to that of the Bernoulli
numbers. Stirling’s formula, which we shall establish in a little while, will
show that $b_{2p}$ increases very rapidly when $p \to +\infty$, as the first numerical
values have already suggested.
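Read backwards, (13.7) gives $\zeta(2p) = |b_{2p}|\,(2\pi)^{2p}/2\,(2p)!$. The following sketch checks this numerically in floating point; the three Bernoulli numbers are copied from Euler's table above, and the function names are assumptions made for this illustration.

```python
import math
from fractions import Fraction

# Values taken from Euler's table of Bernoulli numbers.
BERNOULLI = {2: Fraction(1, 6), 4: Fraction(-1, 30), 6: Fraction(1, 42)}


def zeta_even(p):
    """zeta(2p) obtained from (13.7), for p = 1, 2, 3."""
    b = abs(BERNOULLI[2 * p])
    return float(b) * (2 * math.pi) ** (2 * p) / (2 * math.factorial(2 * p))


def zeta_partial(s, terms=1000):
    """Partial sum 1 + 1/2^s + ... + 1/terms^s, for comparison."""
    return sum(1 / n ** s for n in range(1, terms + 1))
```

For $p = 1$ and $p = 2$ this recovers the classical values $\pi^2/6$ and $\pi^4/90$.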

The formula (7) enables one to calculate the radius of convergence $R = 2\pi$
of the power series $\sum b_nz^n/n!$ directly; indeed,

$\sum_{n\ge2} |b_nz^n|/n! = 2\sum_n \zeta(n)\,(|z|/2\pi)^n = 2\sum_{n,p} (|z|/2\pi p)^n,$

where $n$ runs over the even integers $\ge 2$ and $p$ over the integers $\ge 1$,
and since this is a series with positive terms, the convergence of the left
hand side is equivalent to the unconditional convergence of the double series
obtained (Chap. II, n° 18, Theorem 13) and so presupposes, in particular, that
of the partial series obtained by summing over $n$ for given $p$; this requires
$|z|/2\pi p < 1$ for every $p \ge 1$ and thus $|z| < 2\pi$. Convergence for $|z| < 2\pi$ is
then obtained by interchanging the summations with respect to $n$ and $p$ and
recognising the convergence of the series $\sum |z|^2/(4\pi^2p^2 - |z|^2)$.
An even quicker method is to remark that, for $s > 1$,

$\frac{1}{s-1} = \int_1^{+\infty} x^{-s}\,dx \le \zeta(s) \le 1 + \int_1^{+\infty} x^{-s}\,dx = \frac{s}{s-1}$

(Chap. V, (24.1)), so that $\zeta(2p)$ lies between 1 and 2 for any $p \ge 1$, whence
$b_{2p}z^{2p}/(2p)! \asymp (z/2\pi)^{2p}$ and

$b_{2p} \asymp (2p)!/(2\pi)^{2p}.$

14 — Euler and the power series for arctan x 


The sums of powers and the Bernoulli numbers reappear in Euler's work in 1739
when he calculates the integral

$\arctan x = \int_0^x \frac{dt}{1 + t^2}$

by the method of Cavalieri and others, i.e. as the limit of the Riemann sums
$s_n = \sum nx/(n^2 + p^2x^2)$ corresponding to the subdivisions of $[0,x]$ into
intervals of length $x/n$, the sum being extended over the $p \in [1,n]$. Since

$\frac{nx}{n^2 + p^2x^2} = \frac{x/n}{1 + p^2x^2/n^2} = \frac{x}{n}\sum_{k\ge0} (-1)^k\,\frac{p^{2k}x^{2k}}{n^{2k}},$

the Riemann sum considered can be written

(14.1)  $s_n = \sum_{k=0}^{\infty} (-1)^k\,(1^{2k} + 2^{2k} + \dots + n^{2k})\,x^{2k+1}/n^{2k+1},$

which reintroduces the sums of powers, here the even powers, of the first
$n$ integers. One remarks in passing that the first series converges only if
$|px/n| < 1$, i.e. $|x| < 1$ since $p \in [1,n]$, but this is a detail.

Without referring explicitly to the Bernoulli formulae, Euler uses them
to write that

$s_n = nx/n - (n^3/3 + n^2/2 + n/6)\,x^3/n^3 + (n^5/5 + n^4/2 + n^3/3 - n/30)\,x^5/n^5 - \dots$
$= x - (1/3 + 1/2n + 1/6n^2)\,x^3 + (1/5 + 1/2n + 1/3n^2 - 1/30n^4)\,x^5 - \dots$
$= (x - x^3/3 + x^5/5 - \dots) - (x - x^3 + x^5 - x^7 + \dots)\,x^2/2n -$
$- (x - 2x^3 + 3x^5 - 4x^7 + \dots)\,x^2/6n^2 -$
$- (x - 5x^3 + 14x^5 - 30x^7 + \dots)\,x^4/30n^4 -$
$- (x - 28x^3/3 + 42x^5 - 132x^7 + \dots)\,x^6/42n^6 +$ &c.

The expressions between ( ) may seem bizarre to you, but for Euler it is
obvious that the coefficient of $x^m/?n^m$ ($m = 2, 4, \dots$) is the series

$U_m(x) = x - \frac{(m+1)(m+2)}{2\cdot3}\,x^3 + \frac{(m+1)(m+2)(m+3)(m+4)}{2\cdot3\cdot4\cdot5}\,x^5 - \dots,$

so obvious that he does not prove it, and for good reason: he would have
to use (12.14) and (12.17), which he does not write. Moritz Cantor, though,
who has seen many other displays of acrobatics, tells us (p. 673) “its infinite
form does not please Euler and he launches into a stunning [verblüffende]
transformation” of his formulae.
Indeed, using the binomial series for a negative integral exponent, 


MUm(2) = mae —m(m+1)(m+4 2)x3/3! + 
+ m(m +1)(m +4 2)(m+3)(m+4 4)a4/4!—... = 
= [(l-ix)-™-(1+ia)~“™] /2i= 
= [1 +ix)™—(1—-ix)™] /2i(1+27)" = 
= [mx —m(m-—1)(m— 2)2?/3!+ 
+m(m-—1)...(m—4)x*/4)+...]/ (1427) 


by the binomial theorem. Finally one finds easily that 


(14.2) 8. = (x—2°/34+2°/5—27/7+...) — 
x x 2x 
nl+22)  2.6n21+22)2° 1 


x4 4y 4.3.2 3 
x 
430n¢(1+a2)4\ 1 123 
For $x = 1$ for example, in which case the first term of (2) equals $\pi/4$, one
finds

$$\pi = \frac{4n}{n^2+1} + \frac{4n}{n^2+4} + \frac{4n}{n^2+9} + \ldots + \frac{4n}{n^2+n^2} + \frac{1}{n} + \frac{1}{6n^2} - \frac{1}{42\cdot 2^2\cdot 3n^6} + \frac{5}{66\cdot 2^4\cdot 5n^{10}} - \ldots,$$

a formula “correspondingly more exact as n is large” according to Euler who
immediately adds that despite appearances, the series (2) converges only “up
to a certain rank”, “whatever that means”, after which its terms again start
to increase ...
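Euler's formula for $x = 1$ is easy to test numerically. The following is a minimal Python sketch, a direct evaluation rather than anything Euler did; the name `euler_pi` is an illustrative choice, and the series is cut off after the $1/n^{10}$ correction.

```python
import math

def euler_pi(n: int) -> float:
    """Euler's formula for x = 1, truncated after the 1/n^10 correction."""
    # The Riemann-type sum 4n/(n^2 + k^2), k = 1..n.
    riemann = sum(4 * n / (n * n + k * k) for k in range(1, n + 1))
    # Correction terms written exactly as in the displayed formula.
    corrections = (1 / n + 1 / (6 * n**2)
                   - 1 / (42 * 2**2 * 3 * n**6)
                   + 5 / (66 * 2**4 * 5 * n**10))
    return riemann + corrections

for n in (2, 5, 10):
    # Difference from pi; it shrinks quickly as n grows.
    print(n, euler_pi(n) - math.pi)
```

For small n the difference is dominated by contributions that no power of 1/n captures, which is one way of seeing that the series is only asymptotic.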

Recall that if one expands 1/(1+¢?) as a geometric progression the integral 
for arctan x immediately gives 


$$\arctan x = x - x^3/3 + x^5/5 - \ldots$$


for $|x| < 1$, which is the first term of (2). I do not know what Euler had in
mind in publishing his “stunning” calculations, but one has to admit that 
his introduction of the Bernoulli numbers into the machine leads, as always 
with him, to mathematical pyrotechnics. 


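The situation and the calculations would in fact be more lucid if instead
of starting from the function $1/(1 + x^2)$ one started from an “arbitrary”
function f. For let us write

(14.3) $$\int_0^1 f(t)\,dt = \lim \frac{1}{n}\sum_{p=0}^{n-1} f(p/n) = \lim \mu_n(f)$$

and use the Maclaurin series

(14.4) $$f(x) = \sum f^{(k)}(0)\,x^k/k!$$

which replaces the geometric series $1/(1 + x^2) = 1 - x^2 + x^4 - \ldots$ Calculating
formally — Euler never did otherwise —, we find, using (12.14) and (12.17),

$$\begin{aligned}
\mu_n(f) &= \sum_k \frac{f^{(k)}(0)}{k!\,n^{k+1}}\cdot\frac{1}{k+1}\left[B_{k+1}(n) - b_{k+1}\right] \\
&= \sum_k \frac{f^{(k)}(0)}{(k+1)!\,n^{k+1}} \sum_{p=0}^{k} \binom{k+1}{p} b_p\,n^{k+1-p} \\
&= \sum_{0\le p\le k} \binom{k+1}{p} b_p\,n^{-p} f^{(k)}(0)/(k+1)!
\end{aligned}$$

or, putting $k = p + h$,

$$\mu_n(f) = \sum_{p=0}^{\infty} \frac{b_p}{p!\,n^p} \sum_{h=0}^{\infty} \frac{f^{(p+h)}(0)}{(h+1)!}.$$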


The series

(14.5) $$f^{(p)}(0)/1! + f^{(p+1)}(0)/2! + f^{(p+2)}(0)/3! + \ldots,$$

involving h is the Maclaurin series of $f^{(p-1)}(1)$ without its first term $f^{(p-1)}(0)$.
Thus one finds

(14.6) $$\mu_n(f) = \sum_{p=0}^{\infty} \frac{b_p}{p!\,n^p}\left[f^{(p-1)}(1) - f^{(p-1)}(0)\right].$$

For p = 0, one has $b_0 = 1$ and there remains $f^{(-1)}(1) - f^{(-1)}(0)$, where
$f^{(-1)}$ is in reality a primitive F of f as one sees on putting p = 0 in (5).
The term p = 0 in (6) is precisely the integral of f over [0,1] that we are
calculating, so that (6) actually expresses the difference between the latter
and the sum $\mu_n(f)$. For p = 1, one has $b_1 = -\frac12$ and one finds $[f(0) - f(1)]/2n$.
For p ≥ 2, the odd p do not feature. By the definition of $\mu_n(f)$, multiplying
the two sides by n and adding f(1) to the two sides, one thus finds in the
final analysis the formula


(14.7)
$$\begin{aligned}
f(0) + f(1/n) + \ldots + f(1) = {}& n\int_0^1 f(t)\,dt + \frac12\left[f(0) + f(1)\right] \\
&+ \sum_{p\ge 1} \frac{b_{2p}}{(2p)!\,n^{2p-1}}\left[f^{(2p-1)}(1) - f^{(2p-1)}(0)\right].
\end{aligned}$$


One would find Euler’s results again — apart of course from the “stunning”
transformation which is very specific to the function $1/(1 + t^2)$ — by replacing
the function $t \mapsto f(t)$ by $t \mapsto f(tx)$, whose derivatives are the functions
$f^{(k)}(tx)\,x^k$; this transforms the integral (3) of f over [0,1] into its integral
over [0, x] (divided by x).

If on the other hand one applies this formula to $t \mapsto f(nt)$, which replaces
$f^{(k)}(x)$ by $n^k f^{(k)}(nx)$, one obtains


(14.8) $$f(0) + f(1) + \ldots + f(n) = \int_0^n f(t)\,dt + \frac12\left[f(0) + f(n)\right] + \sum_{p=1}^{\infty} \frac{b_{2p}}{(2p)!}\left[f^{(2p-1)}(n) - f^{(2p-1)}(0)\right].$$


It goes without saying that these purely formal calculations are in general
meaningless apart from the case where f is a polynomial and where the
infinite series reduces to a finite sum. (Exercise. Verify the formula for
$f(x) = x^k$.) Even if the function f is represented everywhere by a convergent
Maclaurin series, it is not clear that these permutations and groupings of
terms are legitimate, and in fact the result (8) is almost always a divergent
series. If on the other hand you apply (8) to a function of period 1, all the
terms of the right hand side are zero apart from the first two, and you will
find, for n = 1 for example, the fanciful formula

$$\frac12\left(f(0) + f(1)\right) = \int_0^1 f(x)\,dx.$$

But this is a beautiful exercise in calculation, and we shall see later that one 
can, as one does in replacing the Taylor series by a finite sum with a con- 
trollable “remainder”, obtain a result which yields a very precise asymptotic 
evaluation of the left hand side of (8). 
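For a monomial $f(x) = x^k$ the series in (14.8) is a finite sum, so the formula can be checked exactly. The sketch below does this with rational arithmetic; the function names are illustrative, and the Bernoulli numbers are generated from the usual recurrence with $b_1 = -\frac12$, which is assumed here to be the convention of the text.

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli(m: int) -> list:
    """b_0..b_m from b_0 = 1 and sum_{p<=k} C(k+1, p) b_p = 0 (so b_1 = -1/2)."""
    b = [Fraction(1)]
    for k in range(1, m + 1):
        b.append(-sum(comb(k + 1, p) * b[p] for p in range(k)) / (k + 1))
    return b

def derivative_of_power(k: int, order: int, x: int) -> Fraction:
    """Value at x of the derivative of the given order of t -> t^k."""
    if order > k:
        return Fraction(0)
    return Fraction(factorial(k), factorial(k - order)) * Fraction(x) ** (k - order)

def right_hand_side(k: int, n: int) -> Fraction:
    """Right hand side of (14.8) for f(x) = x^k; only finitely many terms survive."""
    b = bernoulli(k + 1)
    total = Fraction(n ** (k + 1), k + 1)        # integral of x^k over [0, n]
    total += Fraction(0 ** k + n ** k, 2)        # [f(0) + f(n)]/2
    for p in range(1, (k + 1) // 2 + 1):         # derivatives of order 2p - 1 <= k
        order = 2 * p - 1
        diff = derivative_of_power(k, order, n) - derivative_of_power(k, order, 0)
        total += b[2 * p] / factorial(2 * p) * diff
    return total

for k in range(6):
    # Both columns agree: the formula is exact for polynomials.
    print(k, right_hand_side(k, 10), sum(j ** k for j in range(11)))
```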


15 — Euler, Maclaurin and their summation formula 


The relation (14.8), which is the formal version of the Euler-Maclaurin summation
formula, had in fact already been published by Euler in 1736 in the
Commentarii Academiae Petropolitanae and would appear again in Maclaurin’s
Treatise of Fluxions of 1741; their methods are almost identical, and
there is every reason to believe that Maclaurin had not seen Euler’s memoir
before sending his manuscript to the printer. In both cases we have formal
calculations. Let us, for example, set out the method of the heroic Scot, who, at
this late date, still militates on Newton’s side.
Starting (in modern notation) from the formula

(15.1) $$\int_0^1 f^{(p)}(t)\,dt = \sum_{n=0}^{\infty} f^{(p+n)}(0)/(n+1)!$$

which one obtains by integrating the Taylor (or, on this occasion, Maclaurin)
series of $f^{(p)}(t)$ or, for p = 0, of a primitive of f as in (14.5), Maclaurin tries
to express f(0) as an (infinite ...) linear combination

(15.2) $$f(0) = \sum_{p=0}^{\infty} a_p \int_0^1 f^{(p)}(t)\,dt$$

of the left hand sides, with universal constants $a_p$, i.e. valid for every function
f. On substituting the expressions (1) in (2), he finds the identity

(15.3) $$\sum a_p f^{(p+n)}(0)/(n+1)! = f(0)$$


summing over all pairs of integers p, n ≥ 0. Since the derivatives can be
chosen arbitrarily, as Émile Borel proved a little later, it is necessary (or it
suffices) that the terms containing the derivatives of order ≥ 1 disappear, i.e.
that for every k ≥ 1 the total coefficient of $f^{(k)}(0)$ corresponding to the pairs
(n, p) such that n + p = k should be zero. This can be written

(15.4) $$a_0/(k+1)! + a_1/k! + \ldots + a_k/1! = \sum_{p=0}^{k} a_p/(k-p+1)! = 0;$$

now clearly $a_0 = 1$ since f(0) occurs in (3) only for the pair (0,0). Maclaurin
and Euler then deduced the numerical values of the $a_p$, and if one puts
$a_p = b_p/p!$, though they did not, one again finds that the coefficients satisfy

(15.5) $$b_0 = 1, \qquad \sum_{0\le p\le k} \binom{k+1}{p} b_p = 0$$

since the binomial coefficient equals $(k+1)!/p!(k-p+1)!$. Miracle: the $b_p$
are the Bernoulli numbers!
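The two recurrences can be run side by side. The following Python sketch (function names are illustrative, not taken from the text) computes Maclaurin's constants $a_p$ from (15.4) and the $b_p$ from (15.5) in exact rational arithmetic, so that the relation $a_p = b_p/p!$ can be observed directly.

```python
from fractions import Fraction
from math import comb, factorial

def maclaurin_coefficients(m):
    """a_0..a_m from a_0 = 1 and (15.4): sum_{p<=k} a_p/(k-p+1)! = 0 for k >= 1."""
    a = [Fraction(1)]
    for k in range(1, m + 1):
        # Solve (15.4) for a_k, whose own coefficient is 1/1!.
        a.append(-sum(a[p] / factorial(k - p + 1) for p in range(k)))
    return a

def bernoulli_numbers(m):
    """b_0..b_m from (15.5): b_0 = 1, sum_{p<=k} C(k+1, p) b_p = 0 for k >= 1."""
    b = [Fraction(1)]
    for k in range(1, m + 1):
        # Solve (15.5) for b_k, whose own coefficient is C(k+1, k) = k + 1.
        b.append(-sum(comb(k + 1, p) * b[p] for p in range(k)) / (k + 1))
    return b

a = maclaurin_coefficients(12)
b = bernoulli_numbers(12)
for p in range(13):
    # Third column equals the second multiplied by p!.
    print(p, a[p], b[p])
```

The odd-indexed values beyond $b_1$ come out as zero, which is what the text means by saying that they vanish for p = 3, 5, &c.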
This done, (2) can be written

(15.6) $$f(0) = \int_0^1 f(t)\,dt - \frac12\left[f(1) - f(0)\right] + \sum_{p\ge 1} \frac{b_{2p}}{(2p)!}\left[f^{(2p-1)}(1) - f^{(2p-1)}(0)\right]$$

as these Gentlemen clearly affirm, after calculating the first $b_p$, that they
vanish for p = 3, 5, &c. However, like everyone else at the time, they provided
only the first terms of the series.

On replacing $t \mapsto f(t)$ by $t \mapsto f(t + x)$ one obtains

(15.7) $$f(x) = \int_x^{x+1} f(t)\,dt - \frac12\left[f(x+1) - f(x)\right] + \sum_{p\ge 1} \frac{b_{2p}}{(2p)!}\left[f^{(2p-1)}(x+1) - f^{(2p-1)}(x)\right];$$

on replacing x by k and adding from k = 0 to n − 1 one recovers (14.8).


16 — The Euler-Maclaurin formula with remainder 


Following these excursions into the history of the subject, let us move on to
the correct methods, due to Jacobi (1834) for the expression of the remainder,
and to H. Wirtinger (1902) for the method of integration by parts, as Hairer
and Wanner tell us (p. 162). This is exactly the method we explained for
obtaining Taylor’s formula (Chap. V, n° 18), except that instead of choosing
polynomials $P_p$ satisfying $P_0 = 1$, $P_p' = P_{p-1}$ and vanishing at the right end
of the interval of integration, one chooses polynomials taking the same value
at its two end-points. If these are 0 and 1, we must then assume $P_p = B_p/p!$,
and the method expounded in Chap. V leads, under the same hypotheses, to
the relation

$$f(1) - f(0) = \sum_{p=1}^{r} \frac{(-1)^{p-1}}{p!}\, f^{(p)}(x)B_p(x)\Big|_0^1 + \frac{(-1)^r}{r!}\int_0^1 f^{(r+1)}(x)B_r(x)\,dx.$$

Since $B_1(x) = x - \frac12$ and $B_p(0) = B_p(1) = b_p$ for p ≥ 2, it follows that

$$f(1) - f(0) = \frac12\left[f'(0) + f'(1)\right] + \sum_{p=2}^{r} (-1)^{p-1} b_p\left[f^{(p)}(1) - f^{(p)}(0)\right]/p! + \frac{(-1)^r}{r!}\int_0^1 f^{(r+1)}(x)B_r(x)\,dx.$$

Since $b_3 = b_5 = \ldots = 0$ one can replace $(-1)^{p-1}$ by −1 in the Σ; by applying
the result to a primitive of f, which transforms f(1) − f(0) into the integral
of f over [0,1], and f′(0) + f′(1) into f(0) + f(1), one finally finds

(16.1) $$\frac12\left[f(0) + f(1)\right] = \int_0^1 f(x)\,dx + \sum_{p=2}^{r} \frac{b_p}{p!}\left[f^{(p-1)}(1) - f^{(p-1)}(0)\right] - \frac{(-1)^r}{r!}\int_0^1 f^{(r)}(x)B_r(x)\,dx.$$



To obtain the Euler-Maclaurin formula one considers a function f defined
and of class $C^r$ on an interval [0, n], applies (1) to each function f(x + k),
and adds the relations so obtained. On the left hand side one finds

$$\frac12\left[f(0) + f(1)\right] + \ldots + \frac12\left[f(n-1) + f(n)\right] = f(0) + \ldots + f(n) - \frac12\left[f(0) + f(n)\right].$$

On the right hand side the sum of the integrals of f yields that of f over
[0, n]. In the Σ on the right hand side all the terms cancel in pairs, except
for the values of the derivatives at n and 0. Finally, to write the sum of the
integrals conveniently in terms of $B_r$, one introduces the function $B_r^*(x)$ of
period 1 equal to $B_r(x)$ on [0,1], clearly given by

(16.2) $$B_r^*(x) = B_r(x - [x])$$

where [x] is the integer part of x; then

(16.3) $$\int_0^1 f^{(r)}(x + k)B_r(x)\,dx = \int_k^{k+1} f^{(r)}(x)B_r^*(x)\,dx,$$

which, by addition, yields the integral of the same function over [0, n]. For f
of class $C^{2r}$ one then has the final result, namely


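(16.4)
$$\begin{aligned}
f(0) + \ldots + f(n) = {}& \int_0^n f(x)\,dx + \frac12\left[f(0) + f(n)\right] + \sum_{p=1}^{r} \frac{b_{2p}}{(2p)!}\left[f^{(2p-1)}(n) - f^{(2p-1)}(0)\right] \\
&- \frac{1}{(2r)!}\int_0^n f^{(2r)}(x)B_{2r}^*(x)\,dx.
\end{aligned}$$


For r = 3, for example,

$$\begin{aligned}
f(0) + \ldots + f(n) = {}& \int_0^n f(x)\,dx + \left[f(0) + f(n)\right]/2 + \left[f'(n) - f'(0)\right]/12 \\
&- \left[f'''(n) - f'''(0)\right]/720 + \left[f^{(5)}(n) - f^{(5)}(0)\right]/30240 \\
&- \frac{1}{6!}\int_0^n f^{(6)}(x)B_6^*(x)\,dx.
\end{aligned}$$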


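Exercise. Let f be a function of class $C^{2r}$ on R. Show that

$$\sum_{n\in\mathbb{Z}} f(n) = \int_{-\infty}^{+\infty} f(x)\,dx - \frac{1}{(2r)!}\int_{-\infty}^{+\infty} f^{(2r)}(x)B_{2r}^*(x)\,dx$$

subject to hypotheses to be found.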


17 — Calculating an integral by the trapezoidal rule 


If, in (16.4), one transfers the term $\frac12[f(0) + f(n)]$ to the left hand side, it
becomes

$$\frac12\left[f(0) + f(1)\right] + \frac12\left[f(1) + f(2)\right] + \ldots + \frac12\left[f(n-1) + f(n)\right]$$

and is simply the sum of the areas of the trapezia constructed on the verticals
joining the integer points of the x axis to the corresponding points of the
curve. If f is a function of class $C^{2r}$ on [0,1] and if one applies the preceding
results to the function f(x/n), defined between 0 and n, which replaces
$f^{(k)}(x)$ by $n^{-k}f^{(k)}(x/n)$, one immediately finds the relation

(17.1)
$$\begin{aligned}
\int_0^1 f(x)\,dx = {}& \left[f(0) + f(1/n)\right]/2n + \ldots + \left[f(1 - 1/n) + f(1)\right]/2n \\
&- \left[f'(1) - f'(0)\right]/12n^2 + \left[f'''(1) - f'''(0)\right]/720n^4 - \ldots \\
&- b_{2r}\left[f^{(2r-1)}(1) - f^{(2r-1)}(0)\right]/(2r)!\,n^{2r} \\
&+ \frac{1}{(2r)!\,n^{2r}}\int_0^1 f^{(2r)}(x)B_{2r}^*(nx)\,dx.
\end{aligned}$$

The left hand side represents the “curvilinear” area m(f) bounded by the
graph of f, the x axis and the verticals x = 0 and x = 1. On the right hand
side one then has the sum $T_n(f)$ of the areas of the trapezia inscribed in the
graph of f and having as vertical sides the lines x = k/n. If generally one
puts

$$c_p(f) = b_{2p}\left[f^{(2p-1)}(1) - f^{(2p-1)}(0)\right]/(2p)!,$$

one then finds

(17.2) $$T_n(f) = m(f) + c_1(f)/n^2 + \ldots + \left[c_r(f) + \rho_r(n)\right]/n^{2r}$$

where

(17.3) $$\rho_r(n) = -\frac{1}{(2r)!}\int_0^1 f^{(2r)}(x)B_{2r}^*(nx)\,dx.$$

This expression remains bounded as n increases indefinitely, for the functions
$B^*$ are of period 1 and are polynomials on [0, 1], so bounded in R. The relation
(2), applied with r + 1 in place of r when f is of class $C^{2r+2}$, then can be written

(17.4) $$T_n(f) = m(f) + c_1(f)/n^2 + \ldots + c_r(f)/n^{2r} + O(1/n^{2r+2})$$

and shows that, if f is $C^\infty$, the difference $T_n(f) - m(f)$ is represented by the
asymptotic series $\sum c_p(f)/n^{2p}$ in the sense of n° 10. This also means that

$$T_n(f) - m(f) \sim c_1(f)/n^2, \qquad T_n(f) - m(f) - c_1(f)/n^2 \sim c_2(f)/n^4,$$

etc.

The situation becomes curious if f is the restriction to [0,1] of a periodic
function that is indefinitely differentiable on R and not only on [0,1]. Then
$f^{(k)}(1) = f^{(k)}(0)$ for any k, so (4) reduces to

$$m(f) = T_n(f) + O(1/n^k) \quad\text{for any } k.$$
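The behaviour described in this n° can be observed numerically. The sketch below is only an illustration under stated assumptions: the test function exp, the periodic function $e^{\cos 2\pi x}$ and the names `trapezoid`, `c1`, `c2` are choices made here, not taken from the text. For f = exp every derivative is exp, so $c_1(f) = (e - 1)/12$ and $c_2(f) = -(e - 1)/720$.

```python
import math

def trapezoid(f, n):
    """T_n(f): sum of the areas of the n trapezia on [0, 1] with sides x = k/n."""
    inner = sum(f(k / n) for k in range(1, n))
    return (0.5 * (f(0.0) + f(1.0)) + inner) / n

m = math.e - 1                  # exact integral of exp over [0, 1]
c1 = (math.e - 1) / 12          # b_2 [f'(1) - f'(0)] / 2!   with b_2 = 1/6
c2 = -(math.e - 1) / 720        # b_4 [f'''(1) - f'''(0)] / 4! with b_4 = -1/30

for n in (5, 10, 20):
    err = trapezoid(math.exp, n) - m
    # Raw error, error after removing c1/n^2, error after removing c2/n^4 as well.
    print(n, err, err - c1 / n**2, err - c1 / n**2 - c2 / n**4)

def periodic(x):
    """A C-infinity function of period 1."""
    return math.exp(math.cos(2 * math.pi * x))

# For a smooth periodic function, few points already give nearly full precision.
print(trapezoid(periodic, 8) - trapezoid(periodic, 64))
```

Each correction removes one power of $1/n^2$ from the error for exp, whereas for the periodic function there is nothing to correct.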


18 — The sum 1 + 1/2 + ... + 1/n, the infinite product for the Γ
function, and Stirling’s formula


By simple arguments one may prove the existence of a constant C, or γ,
Euler’s constant, such that


(18.1) $$\lim\,(1 + 1/2 + \ldots + 1/n - \log n) = C = \gamma,$$


a result which provides an excellent order of magnitude for 1+...+1/n for 
n large. But the Euler-Maclaurin formula provides a complete asymptotic 
expansion for it. 

First of all, consider again the general formula (16.4) and assume that
in it the derivative $f^{(2r)}(x)$ is absolutely integrable on the interval [0, +∞[.
This is then true for $f^{(2r)}(x)B_{2r}^*(x)$ too, since the functions $B^*$ are bounded.
The integral from 0 to n is then the difference between the integrals from 0
to +∞ and from n to +∞. Putting

(18.2) $$C(f) = \frac12 f(0) - \sum_{p=1}^{r} b_{2p}f^{(2p-1)}(0)/(2p)! - \frac{1}{(2r)!}\int_0^{+\infty} f^{(2r)}(x)B_{2r}^*(x)\,dx,$$

it follows that

(18.3) $$f(0) + \ldots + f(n) = \int_0^n f(x)\,dx + C(f) + \frac12 f(n) + \sum_{p=1}^{r} b_{2p}f^{(2p-1)}(n)/(2p)! + \rho_r(n)$$

with a “remainder” $\rho_r(n)$ given by

(18.4) $$\rho_r(n) = \frac{1}{(2r)!}\int_n^{+\infty} f^{(2r)}(x)B_{2r}^*(x)\,dx.$$

If f and its successive derivatives tend to 0 at infinity then

$$C(f) = \lim\left[f(0) + \ldots + f(n) - \int_0^n f(x)\,dx\right]$$

for every r: the “remainder” $\rho_r(n)$ tends to 0 since the function under the
∫ sign is by hypothesis absolutely integrable at infinity. This shows that the
constant C(f) does not depend on the number r chosen. One might call it
“Euler’s constant for f” because he had already exhibited it (notation C
or γ) in the case where f(x) = 1/x.

In this particular case, and in other similar cases of functions which are
defined for x > 0 but infinite at x = 0, one has to modify the formulae, i.e.
consider the sum f(1) + ... + f(n). This comes down to applying the initial
formula to the function f(x + 1) or, equivalently, to replacing the limit 0 by 1
in the derivatives and integrals. For f(x) = 1/x the derivatives at x = n are
easily calculated and the remainder is $O(n^{-2r})$ since the function integrated
is $O(x^{-2r-1})$. On replacing r by r + 1 the formula (3) can in this case be
written

(18.5)
$$\begin{aligned}
1 + 1/2 + \ldots + 1/n = {}& \log n + C + 1/2n - 1/12n^2 + 1/120n^4 \\
&- 1/252n^6 + 1/240n^8 - 1/132n^{10} + 691/32760n^{12} \\
&- 1/12n^{14} + \ldots - b_{2r}/2r.n^{2r} + O(1/n^{2r+2}).
\end{aligned}$$

Thus one sees that the sum $1 + 1/2 + \ldots + 1/n = s_n$ is approximately equal
to log n, the error being approximately equal to Euler’s constant


C = γ = 0,577 215 664....


But (5) is much more precise. For example, in the simplest formula

(18.6) $$s_n = \log n + C + 1/2n + \int_n^{+\infty} x^{-2}B_1^*(x)\,dx,$$

one has $|B_1^*(x)| \le \frac12$ since $B_1^*(x) = B_1(x) = x - \frac12$ between 0 and 1. The
integral in (6) therefore lies between −1/2n and 1/2n, so that, on adding the
term 1/2n to the formula, one obtains a result between 0 and 1/n. In other
words,

(18.7) $$s_n = \log n + C + \theta_n/n \quad\text{with } 0 < \theta_n < 1.$$

For $n = 10^6$ one thus finds $s_n = 6.\log 10 + C$ to within $10^{-6}$; since 10
lies between $e^2$ and $e^3$ its log lies between 2 and 3, which shows that $s_n$
lies between 12 and 19; certainly a not very exact result, but obtained in
probably less time than it would take a machine to calculate a million terms
of the harmonic series to a dozen decimal places so as to obtain the result to
within $10^{-6}$.

To improve this rough estimate one needs to know that


log 10 = 2,302 585 092,


a result generously provided, among many others, by the Founders, whence
one deduces $s_n$ = 14,392 726... The same argument shows that on calculating
the sum of the first $10^{100}$ terms of the harmonic series one finds a result
equal, to within 1, to 100.log 10 ≈ 230. One finds in Hairer and Wanner,
II.10, apart from the very precise numerical results, a reproduction p. 167 of
a letter from Euler to Johann Bernoulli, dating from 1740, in Latin, and in
an impeccable script, where the former informs the latter of his numerical
results.
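Nowadays the machine needs well under a second for the million terms, so the estimates can be compared with a direct summation. The sketch below is illustrative only: the function names are choices made here, and the value of Euler's constant is supplied as a known numerical constant rather than computed.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant to double precision (assumed known)

def harmonic(n):
    """s_n = 1 + 1/2 + ... + 1/n, summed with correct rounding."""
    return math.fsum(1.0 / k for k in range(1, n + 1))

def harmonic_asymptotic(n):
    """First terms of (18.5): log n + C + 1/2n - 1/12n^2 + 1/120n^4."""
    return math.log(n) + EULER_GAMMA + 1 / (2 * n) - 1 / (12 * n**2) + 1 / (120 * n**4)

n = 10**6
s_n = harmonic(n)
theta = (s_n - math.log(n) - EULER_GAMMA) * n    # the theta_n of (18.7)
print(s_n, harmonic_asymptotic(n), theta)
```

The printed value of `theta` lies strictly between 0 and 1, as (18.7) requires, and is in fact close to 1/2 because of the term 1/2n.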


From this one may deduce an expansion of the function Γ as an infinite
product. We have already seen [Chap. V, eqn. (23.6)] that

(18.8) $$\Gamma(s) = \lim n!\,n^s/s(s+1)\ldots(s+n)$$

for Re(s) > 0. The reciprocal of the right hand side can again be rewritten
as

(18.9) $$s.\lim\,(1 + s)(1 + s/2)\ldots(1 + s/n)\,n^{-s};$$

now $n^{-s} = e^{-s.\log n}$ and $\log n = (1 + 1/2 + \ldots + 1/n) - C + o(1)$ by (6); so

$$n^{-s} \sim e^{-s(1 + 1/2 + \ldots + 1/n - C)} = e^{Cs}e^{-s}e^{-s/2}\ldots e^{-s/n},$$

whence

$$(9) = s\,e^{Cs}.\lim \prod_{p=1}^{n} (1 + s/p)e^{-s/p}.$$

But, for p large,

$$(1 + s/p)e^{-s/p} = (1 + s/p)\left(1 - s/p + O(1/p^2)\right) = 1 + O(1/p^2)$$

is, for Re(s) > 0 and even for every s ∈ C, the general term of an absolutely
convergent infinite product (Chap. IV, n° 17, Theorem 13), a product whose
value is ≠ 0 for every s ≠ −1, −2, ... Returning to (8), we conclude that, for
Re(s) > 0, the Γ function is everywhere ≠ 0, and is given by

(18.10) $$1/\Gamma(s) = s\,e^{Cs}\prod_{n=1}^{\infty} (1 + s/n)e^{-s/n}$$

where C = γ is Euler’s constant, a famous result due to the latter.
This formula is in fact valid for any s ∈ C. First, it is clear that on
retracing the calculations which brought us from (8) to (10), we have

(18.11) $$s\,e^{Cs}\prod_{n=1}^{\infty} (1 + s/n)e^{-s/n} = \lim s(s+1)\ldots(s+n)/n^s n!$$

for any s ∈ C; the limit exists, like the infinite product, on all C and not
only for Re(s) > 0. But if one denotes the right hand side of (11) by f(s),
one has, for any s ∈ C,

$$s f(s+1) = \lim s(s+1)\ldots(s+n+1)/n^{s+1}n! = f(s)$$

since $n^{s+1}n! \sim (n+1)^s(n+1)!$ as one sees immediately. Now we know
(Chap. V, n° 22, Example 1) that Γ(s + 1) = sΓ(s) for Re(s) > 0. The
two members of this formula being holomorphic in C, the negative integers
removed, (Chap. V, n° 25, Example 5), and therefore analytic — in mathematics,
one may stoop to swindles so long as one warns the victims in advance
— the equality is valid without restriction (principle of analytic continuation:
Chap. II, n° 20). Now consider the product g(s) = f(s)Γ(s). We know that
g(s) = 1 for Re(s) > 0 by (10), and that g(s + 1) = g(s) for any s other than
0, −1, −2, ... It follows clearly that g(s) = 1 everywhere, qed.
Combining (10) and (11), one also finds that

(18.12) $$1/\Gamma(s) = \lim s(s+1)\ldots(s+n)/n^s n!$$

for any s ∈ C. Consequently,

$$\begin{aligned}
1/\Gamma(s)\Gamma(1-s) &= \lim s(s+1)(s+2)\ldots(s+n)(1-s)(2-s)\ldots(n+1-s)/n(n!)^2 \\
&= s.\lim\,(1^2 - s^2)(2^2 - s^2)\ldots(n^2 - s^2)(n+1-s)/n(n!)^2 \\
&= s.\lim\,(1 - s^2)(1 - s^2/2^2)\ldots(1 - s^2/n^2)
\end{aligned}$$

since (n + 1 − s)/n tends to 1. Whence

$$1/\Gamma(s)\Gamma(1-s) = s\prod (1 - s^2/n^2) = \frac{1}{\pi}\sin \pi s$$

[Chap. IV, eqn. (18.16)], a formula due to Euler and which one also writes

(18.13) $$\Gamma(s)\Gamma(1-s) = \pi/\sin \pi s.$$

There are all sorts of other ways to establish these properties of the Gamma
function, among many others.
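Both formulae lend themselves to a quick numerical check for real s. The sketch below is an illustration, not part of the argument: `reciprocal_gamma_product` is a hypothetical name for a partial product of (18.10), Euler's constant is entered as a known numerical value, and the standard library function `math.gamma` serves as the reference. The product converges slowly (the relative error after N factors is roughly $s^2/2N$), hence the large default number of terms.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant to double precision (assumed known)

def reciprocal_gamma_product(s, terms=200_000):
    """Partial product of (18.10): s e^{Cs} prod_{n <= terms} (1 + s/n) e^{-s/n}."""
    value = s * math.exp(EULER_GAMMA * s)
    for n in range(1, terms + 1):
        value *= (1 + s / n) * math.exp(-s / n)
    return value

for s in (0.5, 2.5):
    # The product times Gamma(s) should be close to 1.
    print(s, reciprocal_gamma_product(s) * math.gamma(s))

# Complement formula (18.13) at s = 0.3.
print(math.gamma(0.3) * math.gamma(0.7), math.pi / math.sin(0.3 * math.pi))
```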


Among the functions whose derivatives are integrable at infinity there
appears f(x) = log x, for which

$$f^{(r)}(x) = (-1)^{r-1}(r-1)!/x^r.$$

Formula (18.3) clearly applies for r ≥ 1 and immediately gives

(18.14) $$\log(n!) = n\log n - n + 1 + \frac12\log n + C(f) + \sum_{p=1}^{r} b_{2p}/2p(2p-1)n^{2p-1} + \rho_r(n);$$

the first three terms come from calculating the integral of log x from 1 to
n (primitive: x.log x − x) and then $\rho_r(n) = O(1/n^{2r-1})$ since $f^{(2r)}(x) = O(x^{-2r})$
at infinity; this assumes that r ≥ 1 for otherwise the integral for
the remainder would be divergent.

Rather than going over the expansion again, let us just deduce Stirling’s
formula from it, for r = 2. In this case we obtain

(18.15) $$\log(n!) - n\log n + n - \frac12\log n - c = 1/12n + O(1/n^3)$$

where c = 1 + C(f). The left hand side is the log of

$$u_n = n!\,e^{n-c}/n^{n+\frac12}$$

and since the right hand side tends to 0, we see that $u_n$ tends to 1, whence

(18.16) $$n! \sim e^c n^n e^{-n}\sqrt{n}.$$


While we have no information on Euler’s constant γ for the harmonic
series — one does not even know whether it is algebraic or transcendental —,
we can, here, calculate


(18.17) $$c = \log\sqrt{2\pi}, \quad\text{whence}\quad n! \sim \sqrt{2\pi n}\,(n/e)^n,$$


but the method is not particularly transparent. We start from Wallis’ formula
(Chap. V, n° 17)

$$\frac{\pi}{2} = \lim \frac{2^2 4^2 \ldots (2n)^2}{1^2 3^2 \ldots (2n-1)^2(2n+1)} = \lim \frac{2^{4n}\,n!^4}{((2n)!)^2(2n+1)}$$

and write $((2n)!)^2 \sim (2n)^{4n+1}e^{-4n+2c}$ by (16). It follows that

$$\pi/2 \sim \frac{2^{4n}n^{4n+2}e^{-4n+4c}}{2^{4n+1}n^{4n+1}e^{-4n+2c}(2n+1)} \sim \frac{n\,e^{2c}}{2(2n+1)} \sim e^{2c}/4$$

whence $e^{2c} = 2\pi$ and Stirling’s formula (17).
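A short numerical comparison shows how good the approximation is. The sketch below is illustrative; `stirling_log` is a name chosen here, and `math.lgamma(n + 1)` from the standard library is used as the reference value of log(n!).

```python
import math

def stirling_log(n, with_correction=True):
    """log of sqrt(2 pi n) (n/e)^n, optionally with the 1/12n term of (18.15)."""
    value = n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)
    return value + 1 / (12 * n) if with_correction else value

for n in (5, 50, 500):
    exact = math.lgamma(n + 1)          # log(n!)
    # Second column behaves like 1/12n, third like a multiple of 1/n^3.
    print(n, exact - stirling_log(n, False), exact - stirling_log(n))
```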




19 — Analytic continuation of the zeta function 


In the Euler-Maclaurin formula let us choose $f(x) = 1/x^s$ with Re(s) > 1,
so that the series

(19.1) $$\zeta(s) = \sum 1/n^s$$

converges. Since here

(19.2) $$f^{(r)}(x) = (-1)^r s(s+1)\ldots(s+r-1)/x^{s+r},$$

(16.4), with the limit 0 replaced by 1, can be written

(19.3)
$$\begin{aligned}
f(1) + \ldots + f(n) = {}& \int_1^n x^{-s}\,dx + \frac12(1 + n^{-s}) \\
&+ \sum_{p=1}^{r} b_{2p}\,s(s+1)\ldots(s+2p-2)\left(1 - n^{-s-2p+1}\right)/(2p)! + \rho_r(n)
\end{aligned}$$

with a remainder

$$\rho_r(n) = -\frac{s(s+1)\ldots(s+2r-1)}{(2r)!}\int_1^n B_{2r}^*(x)\,x^{-s-2r}\,dx.$$

When n increases indefinitely, the left hand side tends to ζ(s), the first integral
on the right hand side tends to 1/(s − 1) since Re(s) > 1, the terms
containing a power of n tend to 0, and the integral in the remainder converges.
Multiplying by s − 1, one finds, in the limit,

$$(s-1)\zeta(s) = 1 + \frac12(s-1) + \sum_{p=1}^{r} b_{2p}(s-1)s(s+1)\ldots(s+2p-2)/(2p)! - (s-1)\sigma_r(s)$$

or, writing out the first terms,

(19.4)
$$\begin{aligned}
\zeta(s) = {}& \frac{1}{s-1} + \frac12 + \frac{s}{12} - \frac{s(s+1)(s+2)}{720} + \frac{s(s+1)(s+2)(s+3)(s+4)}{42.6!} - \ldots \\
&+ b_{2r}\frac{s(s+1)\ldots(s+2r-2)}{(2r)!} - \sigma_r(s),
\end{aligned}$$

where we have put

(19.5) $$\sigma_r(s) = \frac{s(s+1)\ldots(s+2r-1)}{(2r)!}\int_1^{+\infty} B_{2r}^*(x)\,x^{-s-2r}\,dx.$$

These formulae assume Re(s) > 1, but the integral (5) converges for Re(s) >
1 − r, and the other terms of (4) are polynomials in s, apart from 1/(s − 1). We may therefore use
(4) to define ζ(s) in the half plane Re(s) > 1 − r, apart from the point s = 1;
and since r is an arbitrary integer > 0, in this way we obtain a definition of
the zeta function valid in the whole complex plane, the point s = 1 deleted.

The point of these calculations is that they furnish a holomorphic function
on C − {1}, equal to ζ(s) on the half plane Re(s) > 1 where the series
converges; to see this it suffices to argue from the integral (5) as we did in
Chap. V, n° 25 for the function Γ(s): since Bernoulli’s function is bounded on
R, the function which one integrates, holomorphic in s, is, on the whole half
plane Re(s) + r > 1 + ε, dominated up to a constant factor by the function
$x^{-1-\varepsilon}$, integrable on [1, +∞[; Theorem 24 bis of Chap. V, n° 25 then yields
the result.

In fact, the function ζ is even (sic) analytic. Not yet having the general
Cauchy-Weierstrass theory at our disposal we have to use a workaday method
to prove it. We write

$$x^{-s} = \exp(-s.\log x) = \sum (-1)^n s^n \log^n x/n!,$$

substitute this result in the integral (5), and integrate it term-by-term, leaving
the justification of this operation until later. We obtain the series

(19.6) $$\sum a_n s^n/n! \quad\text{where}\quad a_n = (-1)^n\int_1^{+\infty} B_{2r}^*(x)\log^n x.x^{-2r}\,dx.$$

Putting $M = \sup|B_{2r}^*(x)|$, we then have, since $x^{-2r} \le x^{-r}$ for x ≥ 1,

(19.7) $$|a_n| \le M\int_1^{+\infty} \log^n x.x^{-r}\,dx$$

and since, at the least, we have to satisfy ourselves that the radius of convergence
of the power series (6) does not reduce to 0, we need to evaluate
the integral (7). Convergence is obvious for r > 1 since $\log^n x$ is $O(x^a)$ at
infinity, for every a > 0. The change of variable $x = e^u$ reduces this integral
to $\int u^n e^{-(r-1)u}\,du$ where one integrates now from 0 to +∞. A second change
of variable (r − 1)u = v reduces us to

$$(r-1)^{-n-1}\int_0^{+\infty} v^n e^{-v}\,dv = (r-1)^{-n-1}\Gamma(n+1) = (r-1)^{-n-1}n!$$

by Chap. V, n° 22, Example 1. The inequality (7) then becomes

$$|a_n| \le M n!/(r-1)^{n+1}.$$

The series (6) is therefore majorised up to a constant factor by the series
with general term $|s|^n/(r-1)^n$, which converges for |s| < r − 1.

The term-by-term integration used to obtain (6) is justified by Theorem 20
of Chap. V, n° 23. On the one side, the series to be integrated, with
general term

$$u_n(x) = (-1)^n B_{2r}^*(x)\log^n x.x^{-2r}s^n/n!,$$

converges normally on every compact subset of [1, +∞[, i.e. on every interval
[1, b] with b < +∞, for, putting $M = \sup|B_{2r}^*(x)|$, one has, on this interval,

$$|u_n(x)| \le M\log^n b.|s|^n/n!,$$

the general term of a convergent series independent of x ∈ [1, b]. On the other
hand,

$$\sum |u_n(x)| \le M.\exp(|s|.\log x)\,x^{-r} = Mx^{|s|-r} = p(x)$$

is a function integrable on [1, +∞[ since, to make the power series (6) converge,
we have already had to assume |s| < r − 1 and thus |s| − r < −1. The
formal calculation above is therefore justified.

The integral (5) is therefore an analytic function of s in the disc |s| < r − 1.
So likewise by (4) is the function (s − 1)ζ(s). But since r is an integer that
may be chosen arbitrarily large, it follows that (s − 1)ζ(s) is analytic on all
of C, qed.

We have thus shown that the function (s − 1)ζ(s) is the restriction to the
half plane Re(s) > 1 of an analytic function on all of C. The latter is unique
by the principle of analytic continuation of Chap. II, n° 20. Later we shall
see that there is a simple relation between ζ(s) and ζ(1 − s).

Formula (19.4), valid for every s ≠ 1, applies mainly when s is an integer
≤ 0. The remainder (5) is then zero if one chooses r suitably, when one finds
a rational value for ζ(s). One can calculate it for the small values of r:


$\zeta(0) = -1/2$, (r = 1)
$\zeta(-1) = -1/2 + 1/2 - 1/6.2! = -1/12$, (r = 1)
$\zeta(-2) = -1/3 + 1/2 - 2/6.2! = 0$, (r = 2)
$\zeta(-3) = -1/4 + 1/2 - 3/6.2! + 3.2/30.4! = 1/120$, (r = 2)


etc. In fact,

$$\zeta(1 - 2r) = -b_{2r}/2r, \qquad \zeta(-2r) = 0$$


for any r ≥ 1, as we shall see in Chapter XII, using other methods.
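These rational values can be produced mechanically. The sketch below evaluates the finite part of formula (19.4), namely $1/(s-1) + 1/2 + \sum_{p\le r} b_{2p}\,s(s+1)\ldots(s+2p-2)/(2p)!$, at s = −k, choosing r so that the factor $s(s+1)\ldots(s+2r-1)$ of the remainder contains a zero. The function names are illustrative, and the Bernoulli numbers are generated from recurrence (15.5).

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli(m):
    """b_0..b_m from b_0 = 1 and sum_{p<=k} C(k+1, p) b_p = 0."""
    b = [Fraction(1)]
    for k in range(1, m + 1):
        b.append(-sum(comb(k + 1, p) * b[p] for p in range(k)) / (k + 1))
    return b

def zeta_nonpositive(k):
    """zeta(-k) for an integer k >= 0, from (19.4) with a vanishing remainder."""
    s = -k
    r = k // 2 + 1                       # guarantees 2r - 1 >= k, so s + k = 0 is a factor
    b = bernoulli(2 * r)
    total = Fraction(1, s - 1) + Fraction(1, 2)
    for p in range(1, r + 1):
        rising = 1
        for j in range(2 * p - 1):       # s(s+1)...(s+2p-2)
            rising *= s + j
        total += b[2 * p] * rising / factorial(2 * p)
    return total

for k in range(8):
    print(-k, zeta_nonpositive(k))
```

The choice `r = k // 2 + 1` reproduces the values of r quoted in the table above (r = 1 for s = 0, −1 and r = 2 for s = −2, −3).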


VII — Harmonic Analysis
and Holomorphic Functions


§ 1. Analysis on the unit circle — § 2. Elementary theorems on
Fourier series — § 3. Dirichlet’s method — § 4. Analytic and holomorphic
functions — § 5. Harmonic functions and Fourier series
— § 6. From Fourier series to integrals


1 — Cauchy’s integral formula for a circle 


It is not the tradition to treat Fourier series and the theory of analytic functions
together. Nevertheless the two theories are closely related. If

$$f(z) = \sum a_n z^n$$

is a power series of radius of convergence R > 0 the function

(1.1) $$f(re^{2\pi it}) = \sum a_n r^n e^{2\pi int}$$

which, for 0 < r < R, represents f on the circle |z| = r is an absolutely
convergent trigonometric series of period 1. It follows [Chap. V, eqn. (5.13)]
that

(1.2) $$\int_0^1 f(re^{2\pi it})e^{-2\pi int}\,dt = \begin{cases} a_n r^n & \text{for } n \ge 0, \\ 0 & \text{for } n < 0. \end{cases}$$

The integral (2) is zero for n < 0 since only positive powers n appear in the
series (1); this shows, in passing, that the function $t \mapsto f(re^{2\pi it})$ is very far
from being the most general periodic function.

As we have seen in Chap. V, it follows from this that for |z| < r

(1.3) $$f(z) = \int_0^1 \frac{re^{2\pi it}}{re^{2\pi it} - z}\,f(re^{2\pi it})\,dt.$$

If we perform the change of variable $\zeta = re^{2\pi it}$ in (3) (a priori forbidden
since the values are complex) and if we calculate à la Leibniz, we have
$d\zeta = 2\pi i\,re^{2\pi it}\,dt$, which allows us to write (3) in Cauchy’s form

(1.4) $$\int_{|\zeta|=r} \frac{f(\zeta)}{\zeta - z}\,d\zeta = \begin{cases} 2\pi i f(z) & \text{for } |z| < r \\ 0 & \text{for } |z| > r \end{cases}$$

where we integrate along the circumference |ζ| = r oriented traditionally¹;
we are in fact dealing with a particular case of a much more general formula
— one may integrate along arbitrary closed curves! —, obtained by Cauchy
much later than (4), and which cannot be obtained by calculations of the
preceding type. But (4) nevertheless shows that, in the disc |z| < r < R,
one may calculate f from its values on the circumference |z| = r through an
explicit formula of the simplest kind.

Why does one find 0 for |z| > r in (4)? Because, putting $u = e^{2\pi it}$, one
may write

(1.5) $$\frac{ru}{ru - z} = -\frac{ru/z}{1 - ru/z} = -\sum_{n\ge 1} (ru/z)^n$$

and obtain a convergent series which can be integrated term-by-term in (3).
Therefore

(1.6) $$\int_0^1 \frac{re^{2\pi it}}{re^{2\pi it} - z}\,f(re^{2\pi it})\,dt = -\sum_{n\ge 0} (r/z)^{n+1}\int_0^1 f(re^{2\pi it})e^{2\pi(n+1)it}\,dt,$$

which causes the coefficients of index < 0 in the Fourier series representing
$f(re^{2\pi it})$ to appear; but by (2) they vanish.

If for example $f(z) = z^n$ with n ∈ N, one finds

(1.7) $$2\pi i z^n = \int_{|\zeta|=r} \frac{\zeta^n\,d\zeta}{\zeta - z} \quad\text{for } |z| < r$$

(and 0 for |z| > r) or, putting a = z/r,

(1.8) $$a^n = \int_0^1 \frac{e^{2\pi i(n+1)t}}{e^{2\pi it} - a}\,dt \quad\text{for } |a| < 1,\ n \in \mathbb{N}.$$

¹ Let $t \mapsto \gamma(t) = (x(t), y(t))$ be a differentiable map of a compact interval I into
C, whence a “curve”, the trajectory of the point γ(t). If f(z) is a continuous
function of z defined on an open set containing the curve we put

$$\int f(z)\,dz = \int f[\gamma(t)]\gamma'(t)\,dt$$

and more generally

$$\int u(x, y)\,dx + v(x, y)\,dy = \int \{u[\gamma(t)]x'(t) + v[\gamma(t)]y'(t)\}\,dt$$

if u and v are continuous on a neighbourhood of γ(I). Formula (4) corresponds
to the case where $\gamma(t) = re^{2\pi it}$. If s = θ(t) is a map of class $C^1$ of I onto an
interval J, the integral $\int f(z)\,dz$ does not change if one replaces $t \mapsto \gamma(t)$ by
$s \mapsto \gamma[\theta(s)]$, by the Chain Rule: we have $\int \gamma[\theta(t)]\theta'(t)\,dt = \int \gamma(s)\,ds$, where the
integrals are taken over I and J respectively. See Vol. III, Chap. VIII, n° 2.

One may go further and also calculate the derivatives

(1.9) $$f^{(k)}(z) = \sum n(n-1)\ldots(n-k+1)a_n z^{n-k}$$

of f. One proceeds in the same way, but this time using the relation

(1.10) $$\sum n(n-1)\ldots(n-k+1)q^{n-k} = k!/(1-q)^{k+1},$$

which follows by differentiation (Chap. II, eqn. (19.14)) in place of the formula
$\sum q^n = 1/(1-q)$; one substitutes the $a_n$ given by (2) in (9), whence


$$\begin{aligned}
f^{(k)}(z) &= \sum n(n-1)\ldots(n-k+1)z^{n-k}r^{-n}\int f(re^{2\pi it})e^{-2\pi int}\,dt \\
&= \sum n(n-1)\ldots(n-k+1)\int (z/re^{2\pi it})^{n-k}(re^{2\pi it})^{-k}f(re^{2\pi it})\,dt
\end{aligned}$$

where one integrates over [0,1]; one verifies, as for k = 0, that one may
integrate the series term-by-term, whence, using (10),

$$f^{(k)}(z) = k!\int (1 - z/re^{2\pi it})^{-k-1}(re^{2\pi it})^{-k}f(re^{2\pi it})\,dt,$$

(1.11) $$f^{(k)}(z) = k!\int_0^1 \frac{re^{2\pi it}}{(re^{2\pi it} - z)^{k+1}}\,f(re^{2\pi it})\,dt$$

or again, à la Leibniz,

(1.11’) $$2\pi i f^{(k)}(z) = k!\int_{|\zeta|=r} \frac{f(\zeta)\,d\zeta}{(\zeta - z)^{k+1}}.$$

(11’) can be deduced formally from (4) by differentiating the factor 1/(ζ − z)
appearing in the integral (4) k times with respect to z. In fact, Theorem 9
of Chap. V, n° 9 (differentiation under the ∫ sign) would allow us to justify
this operation a priori, starting from (6) without intermediate calculations,
since to differentiate an analytic function with respect to z is the same as
differentiating it with respect to x = Re(z).
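Formula (1.11) can be checked in the same numerical spirit, again with an equally spaced Riemann sum in place of the integral. In the sketch below `cauchy_derivative` is an illustrative name, and the test functions (exp, whose derivatives are all exp, and $z^5$) are chosen here for convenience.

```python
import cmath
from math import factorial

def cauchy_derivative(f, z, k, r=1.0, samples=512):
    """k-th derivative of f at z from (1.11), the integral replaced by a Riemann sum."""
    total = 0j
    for j in range(samples):
        zeta = r * cmath.exp(2j * cmath.pi * j / samples)
        total += zeta / (zeta - z) ** (k + 1) * f(zeta)
    return factorial(k) * total / samples

z = 0.2 + 0.1j
print(cauchy_derivative(cmath.exp, z, 3), cmath.exp(z))          # third derivative of exp
print(cauchy_derivative(lambda w: w ** 5, z, 2), 20 * z ** 3)    # second derivative of z^5
```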


The preceding assumes that the function f is analytic; what happens if it
is only holomorphic, i.e. $C^1$ in the real sense on a disc |z| < R and a solution
of the Cauchy equation

(1.12) $$D_2 f = iD_1 f\ ?$$

As we want to show that f is in fact analytic we are forced to reverse the
procedure, i.e. to introduce the Fourier coefficients

(1.13) $$a_n(r) = \int_0^1 f(re^{2\pi it})e^{-2\pi int}\,dt$$

for r < R, and to show that
(i) they are of the form

(1.14) $$a_n(r) = a_n r^n$$

with numerical coefficients $a_n$ independent of r,
(ii) we have $a_n = 0$ for n < 0,
(iii) and

(1.15) $$f(re^{2\pi it}) = \sum a_n(r)e^{2\pi int}$$

for any t ∈ R and r < R. On substituting (14) in (15) we will find an
expansion of f(z) as a power series, by (ii).

We might establish points (i) and (ii) as of now, the first using (12), the
second by observing that, by (13), the function $a_n(r)$ must remain bounded
as r tends to 0. Point (iii), on the other hand, assumes known the fact that the
Fourier series of a function of class $C^1$ is absolutely convergent and represents
the given function. These points will be justified later.
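Points (i) and (ii) can at least be observed numerically on an example. In the sketch below, which is only an illustration, f = exp (so that one expects $a_n(r) = r^n/n!$ for n ≥ 0 and 0 for n < 0), the integral (1.13) is replaced by an equally spaced Riemann sum, and `fourier_coefficient` is a name chosen here.

```python
import cmath
from math import factorial

def fourier_coefficient(f, n, r, samples=256):
    """Riemann-sum approximation of a_n(r) in (1.13) for the circle of radius r."""
    total = 0j
    for j in range(samples):
        t = j / samples
        total += f(r * cmath.exp(2j * cmath.pi * t)) * cmath.exp(-2j * cmath.pi * n * t)
    return total / samples

for n in (-2, 0, 1, 4):
    for r in (0.5, 1.5):
        # For n >= 0 the quotient a_n(r)/r^n does not depend on r; for n < 0 it vanishes.
        print(n, r, fourier_coefficient(cmath.exp, n, r) / r ** n)
```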

All this shows that the foundations of the theory of analytic or holomorphic
functions rest on Fourier series or can be deduced therefrom. We
shall see that conversely one may use the Cauchy formula to obtain the first
theorems on Fourier series.

In this chapter you will find only those properties of holomorphic functions
which can be derived from the theory of Fourier series. Everything
that depends on integrals over arbitrary curves (the Cauchy theory) will be
expounded in Volume III.


§ 1. Analysis on the unit circle


2 — Functions and measures on the unit circle 


The purpose of this § is to present some definitions and notations which we
shall use constantly, and to clarify a number of preliminary questions.

We shall adopt the notation T (the one-dimensional “torus”) to denote
the set of the complex numbers u such that |u| = 1; some other authors
denote it by U (the “unitary” group in one variable), not to speak of those
who prefer to write R/Z... The aim of the theory of Fourier series is to
expand “arbitrary” functions defined on T in series whose general term is a
multiple of $e^{2\pi int} = u^n$, putting $u = e^{2\pi it}$. Note that if one puts


(2.1) $$\chi_n(u) = u^n \quad\text{for every } u \in \mathbb{T}$$


for n ∈ Z one obtains a continuous function on T such that

(2.2) $$\chi(uv) = \chi(u)\chi(v) \quad\text{for any } u, v \in \mathbb{T}.$$


We shall see a little later that this equation has no continuous solutions other
than the functions $u \mapsto u^n$, and it is this remark which is at the origin of the
contemporary generalisations of the theory.


(i) How to eliminate the factors $2\pi$ 


As far as possible I intend neither to bore the reader with the factors $2\pi$ 
and the exponentials which uselessly encumber this kind of mathematics, nor 
to inflict them on my two typists. I will therefore use the notation 


(2.3) $e(t) = e^{2\pi it}, \quad e_n(t) = e^{2\pi int} = e(nt) = e(t)^n$ 


where the factors $2\pi$, relegated to the exponents, are invisible and can be 
absorbed into “macros” that can be typed globally; this convention is already 
to be found in Hardy and Wright. 

I earnestly advise the reader to reread n° 14 of Chap. IV on the imagi- 
nary exponentials, since it will be used constantly. The exponentials (1) have 
period 1 and in the sequel we shall consider functions of period 1 alone: a 
function f of period T is transformed into a function of period 1 on considering 
$t \mapsto f(Tt)$ instead of f(t). Users may have excellent reasons to drag 
cohorts of functions $\cos(2\pi nt/T)$ after themselves, but we have none here. 
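The notation (2.3) can be checked numerically. The following is only a minimal sketch: the function names `e` and `e_n`, the sample values and the test function `f` are illustrative assumptions, not part of the text.

```python
import cmath
import math

def e(t: float) -> complex:
    """The exponential e(t) = exp(2*pi*i*t), which has period 1."""
    return cmath.exp(2j * math.pi * t)

def e_n(n: int, t: float) -> complex:
    """The n-th exponential e_n(t) = exp(2*pi*i*n*t)."""
    return cmath.exp(2j * math.pi * n * t)

t, n = 0.3, 5
# The three expressions in (2.3) agree up to rounding error.
assert abs(e_n(n, t) - e(n * t)) < 1e-12
assert abs(e_n(n, t) - e(t) ** n) < 1e-12
# Period 1: shifting t by an integer does not change the value.
assert abs(e(t + 1) - e(t)) < 1e-12

# A function of period T becomes a function of period 1 via t -> f(T t).
T = 4.0
f = lambda s: math.cos(2 * math.pi * s / T)  # hypothetical function of period T
g = lambda s: f(T * s)                       # the rescaled function, of period 1
assert abs(g(t + 1) - g(t)) < 1e-12
```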


(ii) Functions on T and periodic functions 


To every function f on T there corresponds, on R, a function $t \mapsto f[e(t)]$ 
of period 1. Conversely, every function f(t) of period 1 on R can be considered 
as a function on the unit circle T: one puts 


(2.4) $f(u) = f(t)$ if $u = e(t)$, whence $f(t) = f[e(t)]$; 

since f(t + 1) = f(t) there is no ambiguity, t being determined modulo an 
integer when one knows u. This is an abuse of notation since a function on R 
is not, strictly speaking, a function on T; but it is indispensable to be able to 
adopt both of these two points of view; and to use different notations for the 
two functions which correspond “canonically” according to (4) would make 
the text unreadable. 

This correspondence between functions defined on these different sets 
preserves continuity. Since the map $t \mapsto e(t)$ of R onto T is continuous, the 
continuity of f at a point of T trivially implies its continuity at the 
corresponding points of R. On the other hand, even though the map $t \mapsto e(t)$ is 
not globally bijective, it maps every compact interval $I \subset \mathbb{R}$ of length strictly 
< 1 bijectively onto a closed arc $K \subset \mathbb{T}$ of the circle. Restricted to I, the 
map $t \mapsto e(t)$ therefore has an inverse map $K \to I$ which is also continuous 
(Chap. III, n° 9). The map $t \mapsto e(t)$ thus inversely transforms every 
continuous function on I into a continuous function on K. This is equivalent to 
saying that a function defined on the circle |u| = 1 is continuous if and only 
if it is a continuous function of the polar angle, or argument, of the point u, 
as is obvious from a sketch. § 4 of Chap. IV on the uniform branches of the 
“function” Arg z also shows that on a neighbourhood of each point $u_0 \in \mathbb{T}$ 
or even on T with a point removed, for example on T − {1}, but not on the 
whole of T, one may choose t so that it is a continuous function of u = e(t). 


(iii) Characterisation of the exponentials 


The correspondence (4) allows us to show that the functional equation 
(2) has no continuous solutions apart from the functions (1) (and the trivial 
solution $\chi = 0$, which we exclude). To start with, 
note that for every continuous solution of (2) which is not identically zero 


$|\chi(u)| = 1$ for every $u \in \mathbb{T}$; 


this relates to the fact that, endowed with the usual multiplication and 
topology of the complex numbers, T is a “compact group”: since the continuous 
function $\chi$ is bounded on the compact set T, it follows that for every $u \in \mathbb{T}$, 
the family of the numbers $\chi(u^n) = \chi(u)^n$ for $n \in \mathbb{Z}$ is likewise bounded; it 
remains then to show that the only complex numbers z such that 


$\sup_{n \in \mathbb{Z}} |z^n| < +\infty$ 


are those of modulus 1, which is clear. Since (2) implies $\chi(u^{-1}) = \chi(u)^{-1}$, 
it follows that every such solution of (2) also satisfies 


(2.5) $\chi(u^{-1}) = \overline{\chi(u)}$ 


and more generally 


(2.6) $\chi(uv^{-1}) = \chi(u)\overline{\chi(v)}$ 

for any $u, v \in \mathbb{T}$. 
On the other hand, the continuous function $\chi(x) = \chi(e(x)) = \chi(e^{2\pi ix})$ 
on R satisfies the functional equation 


$\chi(x + y) = \chi(x)\chi(y)$ 


of Chap. IV, n° 13. We have shown there that every solution of the latter is 
of the form 


(2.7) $\chi(x) = \exp(cx)$ 

with a constant $c \in \mathbb{C}$, on condition that we know that the function $\chi$ has a 
derivative at the origin. 

But, in fact, continuity suffices, and implies much more than 
differentiability at the origin. To see this, one chooses on R a function $\varphi \in \mathcal{D}(\mathbb{R})$ and, 
as in Chap. V, n° 27, one regularises $\chi$ by means of the convolution product 


(2.8) $\chi * \varphi(x) = \int_{\mathbb{R}} \chi(x - y)\varphi(y)\,dy = \int_{\mathbb{R}} \chi(x)\chi(y)^{-1}\varphi(y)\,dy = c.\chi(x)$; 


the constant $c = \int \chi(y)^{-1}\varphi(y)\,dy$ may be assumed to be nonzero, since, if 
$\chi * \varphi$ were zero for every $\varphi \in \mathcal{D}(\mathbb{R})$, then so would be the function $\chi$ (Chap. V, 
n° 27, Theorem 26). Now the function $\chi * \varphi$ is $C^\infty$ for every $\varphi \in \mathcal{D}(\mathbb{R})$. So 
likewise is $\chi$. 

We may now apply the result of Chap. IV, n° 13 and write (7). But 
in the present case $|\chi(x)| = 1$ for every $x \in \mathbb{R}$. Putting c = a + ib with 
a and b real, we have |exp(cx)| = exp(ax) for $x \in \mathbb{R}$, whence a = 0 and 
$\chi(x) = \exp(ibx)$ with b real. For the result to be of period 1 it is necessary 
and sufficient that exp(ib) = 1, i.e. that $b = 2\pi n$ with an $n \in \mathbb{Z}$. Finally we 
find $\chi(x) = \exp(2\pi inx)$, whence $\chi(u) = u^n$, qed. 

We shall later call every function of the form $u \mapsto u^n = \chi(u)$ a character 
of T. This terminology comes from the theory of commutative groups 
(Chap. XI), where one considers the solutions of the functional equation (2) 
systematically on the given group G. If one assumes G commutative and 
finite — the simplest case, involving only algebra —, every function on G is, 
and in a unique way, a linear combination of characters of G, of which there 
are Card(G); this is the simplest version of the Fourier transform, though 
dating from very much later than Fourier himself (even though Dirichlet 
had used this idea for the group Z/nZ in proving his theorem on arithmetic 
progressions). 
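For a finite commutative group the statement can be verified by direct computation. The sketch below takes the cyclic group Z/NZ as an illustrative assumption; the helper names and the sample function `f` are hypothetical, and the normalisation of the coefficients by 1/N is one convention among several.

```python
import cmath
import math

N = 6  # order of the hypothetical finite group G = Z/NZ

def character(k, x):
    """The k-th character of Z/NZ: x -> exp(2*pi*i*k*x/N)."""
    return cmath.exp(2j * math.pi * k * x / N)

def coefficients(f):
    """Coefficients c_k = (1/N) * sum over x of f(x) * conj(chi_k(x))."""
    return [sum(f[x] * character(k, x).conjugate() for x in range(N)) / N
            for k in range(N)]

def synthesize(c):
    """Rebuild f as the linear combination sum over k of c_k * chi_k."""
    return [sum(c[k] * character(k, x) for k in range(N)) for x in range(N)]

f = [3, -1, 4, 1, -5, 9]              # an arbitrary function on Z/NZ
rebuilt = synthesize(coefficients(f))  # should reproduce f up to rounding
assert all(abs(a - b) < 1e-9 for a, b in zip(f, rebuilt))
```

There are exactly N = Card(G) characters here, one for each k in 0, ..., N − 1.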


(iv) Mean value of a function on a circle 


In the theory of analytic functions one often considers the mean value of 
a function over a circle, and, in that of the periodic functions, over a period 
interval. There is no difference between these two ways of integrating. 

First, on a circle |z| = r, a function of z, analytic or not, can, as we have 
seen, be transformed into a function of period 1 by putting 

(2.9) $z = re(t) = re^{2\pi it}$ 


or into a function of period $2\pi$ on putting $z = re^{it}$. Its mean value around 
the circle |z| = r is, by definition, the number which we denote by 


(2.10) $\int_{\mathbb{T}} f(ru)\,dm(u) = \int_0^1 f(re(t))\,dt = \frac{1}{2\pi}\int_0^{2\pi} f(re^{it})\,dt$ 

where T, we recall, denotes the unit circle |u| = 1 in the complex plane. More 
generally we put 


(2.11) $m(f) = \int_{\mathbb{T}} f(u)\,dm(u) = \int_0^1 f[e(t)]\,dt = \int_0^1 f(t)\,dt$ 


for every “reasonable”, for example regulated², function f on T. We will use 
the norms 


$\|f\| = \|f\|_{\mathbb{T}} = \sup |f(u)|, \quad \|f\|_1 = \int |f(u)|\,dm(u), \quad \|f\|_2 = \left(\int |f(u)|^2\,dm(u)\right)^{1/2}$ 

as in R. 


We may in fact, in (11), integrate over any interval of length 1, since 


(2.12) $\int_0^1 f(t)\,dt = \int_a^{a+1} f(t)\,dt = \int_a^{a+1} f(b + t)\,dt$ 


for any $a, b \in \mathbb{R}$ and every function f of period 1 on R (Chap. V, end of n° 2). 
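Relation (2.12) is easy to observe numerically. The sketch below approximates the integrals by Riemann sums over equally spaced points; the function `mean_over_period` and the test function `f` are illustrative assumptions. For a trigonometric polynomial such sums are exact up to rounding once the number of points exceeds the highest frequency.

```python
import math

def mean_over_period(f, a=0.0, steps=1000):
    """Riemann-sum approximation of the integral of f over [a, a + 1]."""
    h = 1.0 / steps
    return sum(f(a + k * h) for k in range(steps)) * h

# A hypothetical test function of period 1; its mean value is 2 + 0 + 1/2.
f = lambda t: 2.0 + math.cos(2 * math.pi * t) + math.sin(6 * math.pi * t) ** 2

m0 = mean_over_period(f)                              # integral over [0, 1]
m1 = mean_over_period(f, a=0.37)                      # integral over [a, a + 1]
m2 = mean_over_period(lambda t: f(0.81 + t), a=-2.5)  # integrand f(b + t)

assert abs(m0 - m1) < 1e-9
assert abs(m0 - m2) < 1e-9
```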

Besides, even in the case of arbitrary functions of period T on R, the 
formulae only mention the mean values of the functions over a period interval; 
it is convenient to write, here again, 


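(2.13) $\oint f(t)\,dt = \int_0^1 f(t)\,dt, \quad \frac{1}{2\pi}\int_0^{2\pi} f(t)\,dt \quad \text{or} \quad \frac{1}{T}\int_0^T f(t)\,dt$ 


according as we study functions of period 1, $2\pi$ or T; the sign $\oint$ dispenses us from 
writing the limits of integration since, by definition, it denotes the mean 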


² This means, as one prefers, that the corresponding periodic function is regulated, 
i.e. has left and right limit values at every point of R, or that the function given 
on T enjoys the same property, the limit values at a point of T being defined in 
the obvious way. The BL theorem being valid for the compact set T, it comes 
to the same to require that for every r > 0 there exists a partition of T into a 
finite number of arcs of circles, of any kind, on each of which the given function 
is constant to within r. One may also define, using such partitions, the notion 
of step function on T, and, generally, transpose to T the arguments of Chap. V, 
n° 7. 
value over a period (or even over several periods) of the integrand. In fact, 
and without express mention to the contrary, all the integrals in dt will be, in 
the rest of this chapter, apart from § 6, extended over any interval of length 
1. 

As we have seen in Chap. V, n° 5, which the reader is invited to read 
again, the essential formulae in the theory of absolutely convergent Fourier 
series (calculation of the coefficients and of the scalar product) stem from the 
“orthogonality relations” of the exponentials, the relation (5.2) of Chap. V. 
With the notation just introduced, they can be written 


$\int e_p(u)\overline{e_q(u)}\,dm(u) = \begin{cases} 1 & \text{if } p = q, \\ 0 & \text{if } p \neq q. \end{cases}$ 


The scalar product (f | g) of two periodic functions introduced in Chap. V, 
eqn. (5.4), will now be written 


$(f \mid g) = \int f(u)\overline{g(u)}\,dm(u) = m(f\bar{g})$. 


The orthogonality relations thus signify that if $\chi$ and $\chi'$ are two characters of 
T, then $(\chi \mid \chi') = 1$ or 0 according to whether $\chi$ and $\chi'$ are equal or different. 
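The orthogonality relations can be observed with a short computation in the "periodic functions on R" version. This is a sketch only: `scalar_product` and `char` are hypothetical helper names, and the integral is replaced by a Riemann sum, which is exact up to rounding for exponentials whose frequencies differ by less than the number of sample points.

```python
import cmath
import math

def char(n):
    """The exponential e_n as a function of t."""
    return lambda t: cmath.exp(2j * math.pi * n * t)

def scalar_product(f, g, steps=256):
    """Approximate (f|g), the integral over a period of f(t) * conj(g(t))."""
    h = 1.0 / steps
    return sum(f(k * h) * g(k * h).conjugate() for k in range(steps)) * h

# (e_p | e_q) is 1 when p = q and 0 otherwise.
assert abs(scalar_product(char(3), char(3)) - 1) < 1e-9
assert abs(scalar_product(char(3), char(-2))) < 1e-9
```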


(v) Measures on T 


The notation dm(u) in (11) indicates that we are integrating with respect 
to a measure on T. We have defined this notion in Chap. V, n° 30 in the case 
of a compact set $X \subset \mathbb{C}$: one considers the vector space $C^0(X)$ of scalar 
functions defined and continuous on X, endowed with the norm 


(2.14) $\|f\|_X = \sup_{x \in X} |f(x)|$ 


of uniform convergence on X; a measure on X is then, by definition, a map 
$\mu$ of $C^0(X)$ into C which is linear and continuous, i.e. satisfies an inequality 


(2.15) $|\mu(f)| \le M(\mu)\,\|f\|_X$ 


which allows one to pass to the limit under the $\int$ sign when integrating a 
uniformly convergent sequence of functions $f_n \in C^0(X)$ with respect to $\mu$. A 
measure is said to be positive if $\mu(f) \ge 0$ for every function $f \ge 0$. It is clear 
that formula (11) defines such a measure on T. 

In the case of the measure m, all that we have said in Chap. V trans- 
poses immediately to integration on T, starting with the notion of integrable 
function (Chap. V, n° 2); it will be the same for an arbitrary measure $\mu$ once 
we have defined the integrable functions in this case. The proper theory of 
Fourier series uses the Lebesgue integral — for m or any other measure on 
T — and has, historically, constituted one of the principal justifications or 
motivations for it. Since we cannot yet do this here, we shall confine 
ourselves, in this chapter, without exceptions, to considering regulated, mostly 
continuous, functions, when we integrate with respect to an arbitrary 
measure: there is no benefit in complicating one’s existence in exploiting to the 
full the possibilities of the Riemann integral by circuitous and complicated 
methods when one can obtain much more complete results more easily using 
the Lebesgue integral (a principle of Dieudonné’s). 

We may also define distributions in the sense of Schwartz on T, as we 
shall see in n° 9. 


(vi) Invariance of the measure m on T 


For the measure m defined on T by (11), we have 


(2.16) $\int f(au)\,dm(u) = \int f(u)\,dm(u)$ for every $a \in \mathbb{T}$; 


this is the analogue of the translation invariance 


(2.17) $\int f(x + a)\,dx = \int f(x)\,dx$ 


of Lebesgue measure on R and of the analogous property 


$\iint f(x + a, y + b)\,dx\,dy = \iint f(x, y)\,dx\,dy$ 


on $\mathbb{R}^2$: the maps $u \mapsto au$, where |a| = 1 (geometrically: rotations about 
the origin), play the same role in T as the translations $x \mapsto a + x$ in R or 
$(x, y) \mapsto (x + a, y + b)$ in $\mathbb{R}^2$. The reader who knows what a group is (additive 
in the case of R or $\mathbb{R}^2$, multiplicative in the case of T) will understand. On 
the multiplicative group $\mathbb{R}^*$ of nonzero real numbers the invariant measure 
is dx/|x| as one sees on making the change of variable $x \mapsto ax$ with $a \in \mathbb{R}^*$. 

To prove (16), it is enough to reduce to (12) by putting u = e(t) and 
$a = e(\alpha)$ with $\alpha \in \mathbb{R}$. 

One can show that, among the measures on T, the measure m is the 
only one that satisfies (16) and attributes the value 1 to the integral of the 
constant function 1. For this reason, one calls m the invariant measure on T. 
The relation (17) likewise characterises Lebesgue measure up to a constant 
factor. The measure m is also invariant under symmetries, i.e. satisfies 


(2.18) $\int f(u^{-1})\,dm(u) = \int f(u)\,dm(u)$ 


for any f. This follows from the corresponding property of Lebesgue measure 
on R: the change of variable $t \mapsto -t$ replaces the integral of f(t) on a period 
interval by the integral of f(−t) on the symmetric interval, so again on a 
period interval. Also $\int f(-t)\,dt = \int f(t)\,dt$ when one integrates over all of R. 
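The invariance properties (2.16) and (2.18) can be illustrated by averaging over equally spaced points of T. In this sketch the function `m`, the test function `f` and the rotation `a` are illustrative assumptions; for the chosen `f`, which extends analytically to a disc of radius 2, the averages agree with the exact mean to within rounding.

```python
import cmath
import math

def m(f, steps=512):
    """Approximate the invariant mean m(f) by averaging f over equally spaced points of T."""
    return sum(f(cmath.exp(2j * math.pi * k / steps)) for k in range(steps)) / steps

f = lambda u: 1 / (2 - u)            # hypothetical continuous test function on T
a = cmath.exp(2j * math.pi * 0.123)  # a fixed element of T, i.e. a rotation

mean_f = m(f)
mean_rotated = m(lambda u: f(a * u))   # left-hand side of (2.16)
mean_inverted = m(lambda u: f(1 / u))  # left-hand side of (2.18)

assert abs(mean_rotated - mean_f) < 1e-9
assert abs(mean_inverted - mean_f) < 1e-9
```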

Finally, we shall need double integrals, for example à propos the 
convolution product on T. In Chap. V, n° 30, we showed that if I and J are 
compact intervals in R and f(x, y) is a function defined and continuous on 
the rectangle I × J, then 


(2.19) $\int d\mu(x) \int f(x, y)\,d\nu(y) = \int d\nu(y) \int f(x, y)\,d\mu(x)$ 


for any measures $\mu$ and $\nu$ on I and J, the common value of the two sides being 
by definition the double integral $\iint f(x, y)\,d\mu(x)\,d\nu(y)$ over I × J. Since, in 
the case of the invariant measure, integration on T reduces to an integration 
over [0,1], it is clear that, for every function f(u,v) defined and continuous³ 
on T × T we will have 


(2.19’) $\int dm(u) \int f(u, v)\,dm(v) = \int dm(v) \int f(u, v)\,dm(u)$, 


the common value being denoted by $\iint f(u, v)\,dm(u)\,dm(v)$. In the general 
case⁴ of two arbitrary measures, it is necessary, to establish (19) on T, to use 
as in Chap. V, n° 30 partitions of unity on T in order to show that every 
continuous function on T × T is the uniform limit of finite sums of functions 
of the type g(u)h(v), with g and h continuous on T; the proofs are the same 
as in Chap. V: one replaces the intervals of R by arcs of the circle. You may 
even, if you think it worthwhile, use the diagram in Chap. V, n° 30, so long 
as you do not forget that, when working on T, the graph of a real function 
is drawn on the cartesian product T × R, i.e. on the surface of the vertical 
cylinder in $\mathbb{R}^3$ having T as base. 


3 — Fourier coefficients 


The Fourier coefficients of a regulated function f(t) of period 1 will be 
denoted 


(3.1) $\hat{f}(n) = \oint f(t)\overline{e_n(t)}\,dt = \int_a^{a+1} f(t)e^{-2\pi int}\,dt$; 

³ A function f defined on T × T is continuous at a point (a,b) of T × T if for any 
r > 0, there exists an r’ > 0 such that 


$\{|u - a| < r' \ \&\ |b - v| < r'\} \Longrightarrow |f(a, b) - f(u, v)| < r$. 


This is the general notion of continuity in a metric space if one defines the 
distance of two elements of T × T by $d[(u', v'), (u'', v'')] = |u' - u''| + |v' - v''|$ 
as in the Appendix to Chap. III. The definition amounts to continuity on R × R 
on considering the function f[e(s), e(t)], which is periodic in s and t. 

⁴ Despite appearances, a measure on T is not a measure on I = [0,1], for 
$C^0(\mathbb{T})$ is identified with the vector subspace of $C^0(I)$ formed by the functions 
such that f(0) = f(1). But every continuous function f on I can be written 
$f(t) = f_0(t) + c(f)t$, with $f_0$ “periodic” and c(f) = f(1) − f(0). If $\mu$ is a continuous 
linear form on the periodic functions, one may then extend it to $C^0(I)$ by putting 
$\mu(f) = \mu(f_0) + \gamma[f(1) - f(0)]$, where $\gamma$ is a constant. One might remove the 
ambiguity by agreeing to choose $\gamma = 0$, but this is a little artificial. 
it is sometimes useful to use a notation such as $a_n(f)$. In the “functions on T” 
version the formula becomes 


(3.1’) $\hat{f}(n) = \int f(u)u^{-n}\,dm(u) \quad (n \in \mathbb{Z})$ 


where m is the invariant measure defined above. If one uses the notation $\chi(u)$ 
to denote a character $u \mapsto u^n$ of T, one may put 


(3.1”) $\hat{f}(\chi) = \int \overline{\chi(u)}f(u)\,dm(u) = (f \mid \chi)$, 


the scalar product of the functions f and $\chi$. 
The first relation to establish is the trivial but useful inequality 


(3.2) $|\hat{f}(n)| \le \int |f(u)|\,dm(u) = \|f\|_1 \le \|f\|$. 


In fact, we shall soon prove much more: the series $\sum |\hat{f}(n)|^2$ converges, so 
that $\hat{f}(n)$ tends to 0 when |n| increases indefinitely (n° 7). 
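As an illustration of (3.1) and (3.2), here is a minimal sketch that approximates the Fourier coefficients of a regulated, discontinuous function by Riemann sums. The sawtooth `f`, the helper names and the number of sample points are assumptions made for the example; a direct integration by parts gives the exact coefficients i/(2πn) for n ≠ 0, which the sums reproduce only approximately.

```python
import cmath
import math

def f(t):
    """Sawtooth: f(t) = t on [0, 1), extended with period 1 (regulated, not continuous)."""
    return t - math.floor(t)

def fourier_coefficient(f, n, steps=4000):
    """Riemann-sum approximation of the integral over [0, 1] of f(t) * exp(-2*pi*i*n*t)."""
    h = 1.0 / steps
    return sum(f(k * h) * cmath.exp(-2j * math.pi * n * k * h) for k in range(steps)) * h

def norm_1(f, steps=4000):
    """Riemann-sum approximation of the integral over [0, 1] of |f(t)|."""
    h = 1.0 / steps
    return sum(abs(f(k * h)) for k in range(steps)) * h

def sup_norm(f, steps=4000):
    """Maximum of |f| over the sample points (a lower estimate of the uniform norm)."""
    return max(abs(f(k / steps)) for k in range(steps))

# Inequality (3.2) for a few nonzero indices, on the sampled data.
for n in range(1, 6):
    assert abs(fourier_coefficient(f, n)) <= norm_1(f) <= sup_norm(f)
```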

More generally one may define the Fourier coefficients of an arbitrary 
measure $\mu$ on T by 


(3.1’’’) $\hat{\mu}(n) = \int u^{-n}\,d\mu(u)$ or $\hat{\mu}(\chi) = \int \overline{\chi(u)}\,d\mu(u)$. 


Compatibility with (1’) is obtained by associating with every regulated 
function⁵ f the measure f(u)dm(u) of density f with respect to the invariant 
measure m. 

If for example $\mu$ is the Dirac measure at the point⁶ u = 1 of T, given by 
$\mu(f) = f(1)$, then we have 


(3.3) $\hat{\mu}(n) = 1$ for every $n \in \mathbb{Z}$, 


which shows that, in contrast to those of a function, the Fourier coefficients of 
a measure need not tend to 0 at infinity. In this case, one may only say that the 
function $\hat{\mu}$ is bounded on Z since the existence of a bound⁷ $|\mu(f)| \le M.\|f\|$ 
clearly implies $|\hat{\mu}(n)| \le M$ for every n. 
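Since a measure is here a linear form on functions, it can be modelled directly as a callable acting on functions. The sketch below is an illustration under that assumption: `dirac_at_one`, `density_measure` and the density `g` are hypothetical names, and the density measure is approximated by a Riemann sum.

```python
import cmath
import math

def dirac_at_one(f):
    """The Dirac measure at the point u = 1 of T: f -> f(1)."""
    return f(1 + 0j)

def density_measure(g, steps=512):
    """The measure f -> integral of f(u) g(u) dm(u), approximated by a Riemann sum."""
    points = [cmath.exp(2j * math.pi * k / steps) for k in range(steps)]
    return lambda f: sum(f(u) * g(u) for u in points) / steps

def fourier_coefficient(mu, n):
    """The n-th Fourier coefficient of mu: the integral of u^(-n) with respect to mu."""
    return mu(lambda u: u ** (-n))

# The Dirac measure has all its Fourier coefficients equal to 1, as in (3.3).
assert all(fourier_coefficient(dirac_at_one, n) == 1 for n in range(-5, 6))

# For a density g, one recovers the Fourier coefficients of the function g.
g = lambda u: 3 + 2 * u - u ** (-2)  # coefficients: 3 at n = 0, 2 at n = 1, -1 at n = -2
mu = density_measure(g)
assert abs(fourier_coefficient(mu, 1) - 2) < 1e-9
```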


The notation⁸ $\hat{f}(n)$ is intended to display the fact that the theory of 
Fourier series consists of associating to every function f on the multiplicative 


⁵ In fact it would be enough for f to be absolutely integrable on T (i.e. on [0, 1] for 
example) in the sense of Chap. V, n° 22, which allows us to extend the definition 
(1) of the Fourier coefficients to this case. 

⁶ Do not confuse this with the Dirac measure on R. The latter is a linear form 
on $C^0(\mathbb{R})$ while here we are concerned only with linear forms on $C^0(\mathbb{T})$. For this 
reason we resist the temptation of again writing $\delta$ for the Dirac measure at the 
point u = 1 of T. 

⁷ One almost always writes $\|f\|$ instead of $\|f\|_{\mathbb{T}}$. 

⁸ It was introduced by André Weil, L’intégration dans les groupes topologiques et 
ses applications (Hermann, 1940) in the framework of the most general version of 

compact group T or, equivalently, to every periodic function on R, a function 
$\hat{f}$ on the discrete additive group Z, its Fourier transform⁹. Conversely, one 
may associate to every function g on Z which tends rapidly enough to 0 at 
infinity a function $\hat{g}$ on T, namely the Fourier series 


(3.4) $\hat{g}(u) = \sum g(n)u^n$ 


whose coefficients are the values of the given function on Z; this is the essence 
of the subject as its contemporary generalisations have shown. On the addi- 
tive group R, the Fourier transform associates to every absolutely integrable 
(or even Lebesgue integrable) function f a function 


(3.5) $\hat{f}(y) = \int \overline{e(xy)}f(x)\,dx = \int e^{-2\pi ixy}f(x)\,dx$ 


defined on the same additive group R; this also appears in the same general 
framework, as do the Fourier transform in $\mathbb{R}^n$ and the theory of multiple 
Fourier series for periodic functions of several real variables, etc. 


The first fundamental problem of the theory is to decide whether every 
“reasonable” function f on T, or periodic on R, is represented by its Fourier 
series, i.e. if one has 


(3.6) $f(u) = \sum \hat{f}(n)u^n$ or $f(t) = \sum \hat{f}(n)e_n(t)$. 


This is the case, we shall see, if f is $C^1$. When one does not know if the 
Fourier series of a function f converges and represents f it is prudent to 
confine oneself to writing something like 


(3.6’) $f(u) \sim \sum \hat{f}(n)u^n$ or $f(t) \sim \sum \hat{f}(n)e_n(t)$ 


to avoid confusion (no connection with asymptotic expansions!). 
Note that, in (4), one adds over all the rational integers and not over N. 
Since 


(3.7) $e_n(t) = \cos(2\pi nt) + i\sin(2\pi nt)$, 


harmonic analysis on commutative topological groups, invented independently 
at the same period, with better methods, by the Soviet school (D. A. Raikov) 
and, in another way, by H. Cartan and R. Godement, Théorie de la dualité et 
analyse harmonique dans les groupes abéliens [= commutative] localement 
compacts (Ann. Ecole Norm. Sup., 64 (1947)), which expounds the whole topic in 
twenty pages. See Chap. XI, 87. 

⁹ In the classical theory, one speaks of the “Fourier transform” only in the case 
of R. But this notion applies to any commutative locally compact group (or even 
non commutative group, but this is far more complicated). 
the series (4) can always be put in the traditional trigonometric form 


(3.8) $a_0 + \sum_{n \ge 1} b_n\cos(2\pi nt) + c_n\sin(2\pi nt)$ 


(take the terms with indices n and −n together), or, if the $b_n$ and $c_n$ are real, 


$\sum I_n\cos[2\pi n(t - \omega_n)]$ 


with “phase lags” $\omega_n$ and “intensities” $I_n \ge 0$ for $n \ge 0$, but using this form 
complicates the calculations; the formulae 


(3.9) $e_n(x + y) = e_n(x)e_n(y), \quad \overline{e_n(t)} = e_n(t)^{-1} = e_{-n}(t), \quad e_p(t)e_q(t) = e_{p+q}(t)$ 


are simpler than the analogous trigonometric formulae and lend themselves 
to the generalisations to group theory¹⁰. 
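Grouping the terms of indices n and −n by means of (3.7) gives $b_n = a_n + a_{-n}$ and $c_n = i(a_n - a_{-n})$, where $a_n$ denotes the coefficient of $e_n$. The following sketch checks this on a finite sum; the dictionary `a` of coefficients and the function names are illustrative assumptions.

```python
import cmath
import math

def to_trigonometric(a):
    """From exponential coefficients a[n] (|n| <= N) to (a_0, b, c) as in (3.8)."""
    N = max(abs(n) for n in a)
    b = {n: a.get(n, 0) + a.get(-n, 0) for n in range(1, N + 1)}
    c = {n: 1j * (a.get(n, 0) - a.get(-n, 0)) for n in range(1, N + 1)}
    return a.get(0, 0), b, c

def exponential_sum(a, t):
    """The sum of a[n] * e_n(t) over the indices present in a."""
    return sum(coef * cmath.exp(2j * math.pi * n * t) for n, coef in a.items())

def trigonometric_sum(a0, b, c, t):
    """The sum a_0 + sum of b_n cos(2 pi n t) + c_n sin(2 pi n t)."""
    return a0 + sum(b[n] * math.cos(2 * math.pi * n * t)
                    + c[n] * math.sin(2 * math.pi * n * t) for n in b)

a = {-2: 1 - 1j, -1: 0.5, 0: 2, 1: 0.5, 3: 1j}  # hypothetical coefficients
a0, b, c = to_trigonometric(a)
t = 0.37
assert abs(exponential_sum(a, t) - trigonometric_sum(a0, b, c, t)) < 1e-12
```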

As we have already said elsewhere, one has always to observe that the 
series (4) being extended over Z and not over N, it has a meaning only if it 
converges unconditionally, i.e. if 


(3.10) $\sum |\hat{f}(n)| < +\infty$. 


This is the case for the functions $f \in C^1(\mathbb{T})$ as we shall see in n° 8, but (10) 
is very likely to be false when one attempts to study more general functions. 
In this case, one gives a sense to the series by putting, by definition, 


(3.11) $\sum_{\mathbb{Z}} \hat{f}(n)e_n(t) = \lim_{N \to +\infty} \sum_{-N}^{N} \hat{f}(n)e_n(t) = \lim f_N(t)$, 


which considerably increases the chances of convergence and amounts, in 
fact, to considering the traditional series (8) and its usual partial sums $f_N(t)$. 


¹⁰ If G is a commutative group endowed with a locally compact topology, one calls 
a character of G any continuous map $\chi : G \to \mathbb{T}$ such that $\chi(uv) = \chi(u)\chi(v)$. 
It is clear that if $\chi'$ and $\chi''$ are two characters, then so likewise is the product 
function $\chi(u) = \chi'(u)\chi''(u)$; endowed with this multiplication, the set of the 
characters of G becomes a group; on endowing this with the topology of compact 
convergence one obtains a new commutative locally compact group, the “dual” 
$\hat{G}$ of G. Since there always exists a positive measure dm(u) on G invariant 
under the translations $u \mapsto uv$, one may associate a “Fourier transform” 
$\hat{f}(\chi) = \int \overline{\chi(u)}f(u)\,dm(u)$ to every function f on G decreasing rapidly enough at infinity. 
One may then choose a positive invariant measure $dm(\chi)$ on $\hat{G}$ so that conversely 
$f(u) = \int \chi(u)\hat{f}(\chi)\,dm(\chi)$ under reasonable hypotheses on f. In the case where 
G = T, the characters are the $e_n(t)$, whence $\hat{G} = \mathbb{Z}$, and the last formula (9) 
shows that the “multiplication” on $\hat{G}$ is precisely the addition on Z. 
These are trigonometric polynomials since only a finite number of nonzero 
coefficients feature in $f_N$. If one knew that for every continuous function f 
on T, the $f_N$ converged uniformly on T to f, one would obtain Weierstrass’ 
approximation theorem of Chap. V, n° 28 for periodic functions. Unfortunately 
this is not so, even if one demands only simple convergence; to obtain uniform 
convergence when f is continuous it is enough to substitute for the $f_N$ their 
arithmetic means $(f_1 + \ldots + f_N)/N$ as we shall see (Fejér’s theorem), which 
will provide a proof — there are others — of the approximation theorem. 
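The arithmetic means of the partial sums are easy to compute. The sketch below only illustrates the definitions, under stated assumptions: the triangle wave `f` is a hypothetical example, its coefficients are approximated by Riemann sums, and since its Fourier series converges absolutely it says nothing about the delicate cases where the $f_N$ themselves fail to converge.

```python
import cmath
import math

def f(t):
    """A continuous function of period 1 (triangle wave), chosen for illustration."""
    t = t - math.floor(t)
    return abs(2 * t - 1)

def fourier_coefficient(n, steps=512):
    """Riemann-sum approximation of the n-th Fourier coefficient of f."""
    h = 1.0 / steps
    return sum(f(k * h) * cmath.exp(-2j * math.pi * n * k * h) for k in range(steps)) * h

def partial_sum(coeffs, N, t):
    """f_N(t): the sum over |n| <= N of coeffs[n] * e_n(t) (real, since f is real)."""
    return sum(coeffs[n] * cmath.exp(2j * math.pi * n * t) for n in range(-N, N + 1)).real

def fejer_mean(coeffs, N, t):
    """The arithmetic mean (f_1 + ... + f_N) / N of the partial sums."""
    return sum(partial_sum(coeffs, j, t) for j in range(1, N + 1)) / N

coeffs = {n: fourier_coefficient(n) for n in range(-32, 33)}
grid = [k / 64 for k in range(64)]
error_8 = max(abs(f(t) - fejer_mean(coeffs, 8, t)) for t in grid)
error_32 = max(abs(f(t) - fejer_mean(coeffs, 32, t)) for t in grid)

# On this example the uniform error of the means decreases as N grows.
assert error_32 < error_8
```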

In a general way, and as we have already said elsewhere, we have to warn 
the reader to exercise the most extreme prudence once he steps outside the 
framework of the $C^1$ functions: most of the statements that one might believe 
obvious are false and, when they are correct, they are never obvious. This is 
one of the charms of the theory for those whom it attracts, and the reason 
why it played such a large role in the development of analysis during all the 
XIXth century and a large part of the following one: when one does not 
understand one tries to understand, and this often brings one much further 
than one had imagined. One of the first traps of the theory is to believe that 
if a trigonometric series 


$a_0 + \sum b_n\cos 2\pi nt + c_n\sin 2\pi nt$, 


with arbitrarily given coefficients converges for any t, then it must be the 
Fourier series of its sum. False: though a simple limit of continuous functions 
and so “measurable” in the sense of Lebesgue, the sum of the series can fail 
to be integrable. The first theorems proved by Cantor in 1870 say that, for a 
trigonometric series, 


(i) the coefficients $b_n$ and $c_n$ tend to 0 if the general term 
$b_n\cos(2\pi nt) + c_n\sin(2\pi nt)$ tends to 0 at every point of an interval I of nonzero length, 
and therefore if the series converges on I, 

(ii) if the series converges to 0 for every $t \in \mathbb{R}$, then all the coefficients are 
zero (“obvious”, but try to prove it ...). 


It was in trying to weaken the hypothesis of the statement (ii), i.e. in trying 
to characterise the sets E C [0,1] such that 


$f(t) = 0$ for every $t \notin E \Longrightarrow a_n = b_n = 0$ 


(“sets of uniqueness”), that Cantor was led to construct more and more 
baroque sets in R, then to his theory of transfinite numbers. Do not confuse 
this, as we have already said, with the naive trivialities of Chap. I, which 
would not have led him to the edge of sanity if he had not been already 
predisposed. This kind of question continues to be the object of much research¹¹; 
most mathematicians and a fortiori users are happy with much less subtle 
results of universal use. 

¹¹ See for example J.-P. Kahane and R. Salem, Ensembles parfaits et séries 
trigonométriques (Paris, Hermann, nouvelle éd. 1987) and J.-P. Kahane and P. 
G. Lemarié-Rieusset, Séries de Fourier et ondelettes (Paris, Cassini, 1997). 
4 — Convolution product on T 


The invariance of the measure m under translation leads to useful formulae. 
For example, the left hand side of formula (2.19’) does not change if in it one 
replaces f(u,v) by f(u, av) for an $a \in \mathbb{T}$ independent of v; in particular, one 
can, for each u, replace v by uv, or $u^{-1}v$, since u is a variable independent 
of v, whence the formulae 


(*) $\iint f(u, v)\,dm(u)\,dm(v) = \iint f(u, u^{-1}v)\,dm(u)\,dm(v)$, 
$\iint f(u, v)\,dm(u)\,dm(v) = \iint f(uv, v^{-1})\,dm(u)\,dm(v)$: 


one integrates first with respect to u, makes the change of variable $u \mapsto uv^{-1}$, 
then integrates with respect to v, finally one replaces v by $v^{-1}$. Similar result 
in R: if f(x,y) is, for simplicity, continuous and of compact support in $\mathbb{R}^2$, 
then 

$\iint f(x, y)\,dx\,dy = \iint f(x + y, -y)\,dx\,dy$ 

since 

$$\begin{aligned}
\iint f(x, y)\,dx\,dy &= \int dy \int f(x, y)\,dx = \int dy \int f(x - y, y)\,dx && \text{(using } x \mapsto x - y\text{)} \\
&= \int dx \int f(x - y, y)\,dy = \int dx \int f(x + y, -y)\,dy && \text{(using } y \mapsto -y\text{)} \\
&= \iint f(x + y, -y)\,dx\,dy.
\end{aligned}$$


The invariant measure allows us, as in R (Chap. V, n° 27), to define the 
convolution product¹² 


(4.1) $f * g(u) = \int f(uv^{-1})g(v)\,dm(v) = \int f(w)g(uw^{-1})\,dm(w) = g * f(u)$ 


of two regulated functions on T or, in the “periodic functions on R” version, 


(4.1’) $f * g(t) = \oint f(t - s)g(s)\,ds = \oint g(t - s)f(s)\,ds$, 


integrating over a period. The equality of the two integrals in (1) is obtained 
by means of the change of variable $v \mapsto uv^{-1} = w$ (or $s \mapsto t - s$), the 
composition of a translation $v \mapsto u^{-1}v$ followed by a symmetry $v \mapsto v^{-1}$; 

¹² A symbol such as f * g(u) denotes the value at the point u of the function f * g 
and replaces the expression (f * g)(u). 

see (*). A convenient way to simplify the theoretical calculations on Fourier 
series is to remark that if y is a character of T, the convolution product 


(4.2) $f * \chi(u) = \int \chi(uv^{-1})f(v)\,dm(v) = \chi(u)\int \overline{\chi(v)}f(v)\,dm(v) = \hat{f}(\chi)\chi(u)$ 


is the general term of the Fourier series of f. The relation (3.6) can then, 
when it is true, be written 


(4.3) $f(t) = \sum f * e_n(t)$ or $f(u) = \sum f * \chi(u)$ 


where, in the second case, one sums over all the characters of T. Symbolically, 
one may also write it as $f = \sum f * e_n = \sum f * \chi$, which has the advantage 
of not presupposing the mode of convergence that one chooses: simple 
convergence, uniform convergence, convergence in mean, etc. It is precisely the 
choice of the mode of convergence to make (3) correct that is the whole theory 
of Fourier series; (3) is always correct in the sense of distributions as we shall 
see, but the convergence of a sequence or series of distributions is, in 
practice, the weakest invented from Newton to nowadays. (Paradoxically, this is in 
fact the interest of distributions: everything which converges in a reasonable 
sense, or does not converge, converges in the sense of distributions). 
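The convolution product and relation (4.2) can be tried out in the "periodic functions on R" version. This is a sketch under stated assumptions: the integral over a period is replaced by a Riemann sum over `STEPS` points (exact up to rounding for the trigonometric polynomials used here), and `f`, `g` and the helper names are illustrative.

```python
import cmath
import math

STEPS = 64
POINTS = [k / STEPS for k in range(STEPS)]

def convolve(f, g):
    """Return t -> f * g(t), the integral over a period of f(t - s) g(s) ds."""
    return lambda t: sum(f(t - s) * g(s) for s in POINTS) / STEPS

def e_n(n):
    """The exponential e_n as a function of t."""
    return lambda t: cmath.exp(2j * math.pi * n * t)

def fourier_coefficient(f, n):
    """The n-th Fourier coefficient of f, by the same Riemann sum."""
    return sum(f(s) * cmath.exp(-2j * math.pi * n * s) for s in POINTS) / STEPS

f = lambda t: 1 + 2 * math.cos(2 * math.pi * t) + math.sin(6 * math.pi * t)
g = lambda t: math.cos(2 * math.pi * t) ** 2

t = 0.2
# Commutativity, as in (4.1).
assert abs(convolve(f, g)(t) - convolve(g, f)(t)) < 1e-9
# Convolution with a character gives the corresponding term of the Fourier series, as in (4.2).
assert abs(convolve(f, e_n(3))(t) - fourier_coefficient(f, 3) * e_n(3)(t)) < 1e-9
```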

The convolution product has properties similar to those¹³ obtained in 
Chap. II, n° 18, Example 3, for functions defined on Z: but the proofs are 
less easy. We shall restrict ourselves to functions which are regulated and so 
bounded; going further requires recourse to the Lebesgue integral and will 
be expounded in Chap. XI, n° 25 in the general framework of group theory 
— for this is a matter of group theory as the case of the convolution product 
on Z has already shown. 

First of all the inequality $|f(uv^{-1})g(v)| \le \|f\|\,|g(v)|$, valid for all u and v, 
shows that, always, 


(4.4) $\|f * g\| \le \|f\|.\|g\|_1 \le \|f\|.\|g\|$. 


We shall deduce from this that the function f * g is continuous. If f is 
continuous this follows directly from Chap. V, n° 9 (Theorem 9, (i)) since we 
are integrating the continuous function $f(uv^{-1})$ with respect to the measure 
g(v)dm(v). In the general case there exists a sequence $(f_n)$ of continuous 
functions (see the lemma in n° 8 below) such that $\lim \|f - f_n\|_1 = 0$; then, 
by (4), 

$\|f * g - f_n * g\| \le \|g\|.\|f - f_n\|_1$, 

from which f * g is the uniform limit of the continuous functions $f_n * g$, whence 
the result. 

¹³ apart from the existence of a unit element: this would be a function e(u) such 
that one had $\int f(uv^{-1})e(v)\,dm(v) = f(u)$ for any f; the only candidate is the Dirac 
“function”, which is a measure and not a function. 
Now let us establish the relations 


(4.5) $f * (g + h) = f * g + f * h, \quad (f * g) * h = f * (g * h)$ 


for f, g and h regulated. The first is obvious. To obtain the associativity 
formula let us first consider an integral of the form 


(4.6) $\iint \varphi(uv)f(u)g(v)\,dm(u)\,dm(v)$ 


where $\varphi$ is continuous and f and g are regulated. Theorem 10 of Chap. V, 
n° 9, and the invariance of the measure show that it is equal to 


$\int g(v)\,dm(v) \int \varphi(uv)f(u)\,dm(u) = \int g(v)\,dm(v) \int \varphi(z)f(zv^{-1})\,dm(z)$, 


whence the relation 

(4.7) $\iint \varphi(uv)f(u)g(v)\,dm(u)\,dm(v) = \int \varphi(z).f * g(z)\,dm(z)$. 

This done, let us consider the triple integral 

(4.8) $I(\varphi) = \iiint \varphi(uvw)f(u)g(v)h(w)\,dm(u)\,dm(v)\,dm(w)$ 


where f, g and h are regulated. Theorem 10 of Chap. V, n° 9, which is clearly 
valid for multiple integrals, shows us that on the one hand 


$$\begin{aligned}
I(\varphi) &= \int h(w)\,dm(w) \iint \varphi(uvw)f(u)g(v)\,dm(u)\,dm(v) \\
&= \int h(w)\,dm(w) \int \varphi(zw)\,f * g(z)\,dm(z) && \text{by (7)} \\
&= \iint \varphi(zw).f * g(z).h(w)\,dm(z)\,dm(w) \\
&= \int \varphi(z).(f * g) * h(z)\,dm(z),
\end{aligned}$$


applying (7) again, to the functions f * g and h. 


But one can also calculate $I(\varphi)$ alternatively as 

$$\begin{aligned}
I(\varphi) &= \int f(u)\,dm(u) \iint \varphi(uvw)g(v)h(w)\,dm(v)\,dm(w) \\
&= \int f(u)\,dm(u) \int \varphi(uz)\,g * h(z)\,dm(z) \\
&= \int \varphi(z).f * (g * h)(z)\,dm(z)
\end{aligned}$$

by applying (7) now to f and g * h. Comparing the two results, we see that the 
function $F = f * (g * h) - (f * g) * h$ satisfies $\int \varphi(z)F(z)\,dm(z) = 0$ for every 
continuous $\varphi$ on T. Now F is itself continuous. One may therefore choose $\varphi$ 
to be the conjugate of F, whence $\int |F(z)|^2\,dm(z) = 0$ and F = 0 (Chap. V, 
n° 7, Theorem 7), which proves associativity for regulated functions, if not 
yet for all the integrable functions of the Appendix to Chap. V. 

Along with (4) for the uniform norm, we also have 


(4.9) $\|f * g\|_1 \le \|f\|_1.\|g\|_1$. 


Replacing f and g by their absolute values does not change the right hand 
side, but increases the left hand side, since 


$|f * g(u)| \le \int |f(uv^{-1})|.|g(v)|\,dm(v) = |f| * |g|(u)$, 


so it is enough to prove (9) for positive f and g. Relation (7) with $\varphi = 1$ 
shows that 

$\|f * g\|_1 = \|f\|_1.\|g\|_1$ 


in this case, qed. 
The Fourier series of a convolution product is calculated very simply from 
the formula 


(4.10) $\widehat{f * g}(n) = \hat{f}(n)\hat{g}(n)$. 

To see this, use the associativity of the convolution product: 

$\widehat{f * g}(n)e_n = (f * g) * e_n = f * (g * e_n) = f * (\hat{g}(n)e_n) = \hat{g}(n)\,f * e_n = \hat{g}(n)\hat{f}(n)e_n$. 


For the inverse Fourier transform, which starts from a function f(n) in $L^1(\mathbb{Z})$, 
i.e. such that $\sum |f(n)| < +\infty$, and leads to a function 


(4.11) $\hat{f}(u) = \sum f(n)u^n$ 


one has likewise, for the convolution product on Z (Chap. II, n° 18, 
Example 3), 

$\widehat{f * g}(u) = \sum f * g(n)u^n = \sum f(p)g(n - p)u^n = \sum f(p)g(q)u^{p+q} = \sum f(p)u^p \sum g(q)u^q = \hat{f}(u)\hat{g}(u)$. 


The Fourier transform thus interchanges convolution products and ordinary 
products. 
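For functions of finite support on Z the computation above is just the multiplication of Laurent polynomials. The sketch below illustrates it; the dictionaries `f` and `g`, the function names and the sample point `u` are assumptions made for the example.

```python
import cmath

def convolve_on_Z(f, g):
    """Convolution on Z of finitely supported functions: h(n) = sum over p of f(p) g(n - p)."""
    h = {}
    for p, fp in f.items():
        for q, gq in g.items():
            h[p + q] = h.get(p + q, 0) + fp * gq
    return h

def transform(f, u):
    """The function (4.11): the sum of f(n) u^n over the support of f."""
    return sum(fn * u ** n for n, fn in f.items())

f = {-1: 2, 0: 1, 2: -3}  # hypothetical functions on Z, given by their nonzero values
g = {0: 4, 1: 1}
h = convolve_on_Z(f, g)

u = cmath.exp(0.7j)       # an arbitrary point of T
assert abs(transform(h, u) - transform(f, u) * transform(g, u)) < 1e-12
```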

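This last result is particularly obvious in the framework of measures. Let
$\mu$ and $\nu$ be two measures on T and let us calculate the product


(4.12)    $\hat\mu(n)\, \hat\nu(n) = \int \overline{\chi_n(u)}\, d\mu(u) \cdot \int \overline{\chi_n(v)}\, d\nu(v) = \iint \overline{\chi_n(u)\, \chi_n(v)}\, d\mu(u)\, d\nu(v) = \iint \overline{\chi_n(uv)}\, d\mu(u)\, d\nu(v)$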


of their Fourier transforms. This leads us to consider more generally the map 


(4.13)    $\lambda : f \mapsto \iint f(uv)\, d\mu(u)\, d\nu(v)$


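of $C^0(\mathbb{T})$ into $\mathbb{C}$; this is clearly a linear form on $C^0(\mathbb{T})$, and it is continuous:


$$|\lambda(f)| = \left| \int f(x)\, d\lambda(x) \right| \le \|\mu\| \cdot \|\nu\| \cdot \|f\|.$$


Consequently, $\lambda$ is again a measure, which one calls the convolution product
of the measures $\mu$ and $\nu$, notation $\lambda = \mu * \nu$. Clearly $\lambda$ is positive if $\mu$ and $\nu$
are. With these conventions the formula (12) can be written


(4.14)    $\hat\mu(n)\, \hat\nu(n) = \hat\lambda(n)$ where $\lambda = \mu * \nu$.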


One finds (4) again on considering the measures $d\mu(u) = f(u)\, dm(u)$ and
$d\nu(u) = g(u)\, dm(u)$, as the reader can easily verify.

There is no simple formula analogous to (1) for defining or calculating 
the convolution product of two measures; the simplicity of the definition (13) 
shows once more the advantage in defining measures as linear forms on the 
continuous functions and not starting from a function of sets. 


5 — Dirac sequences in T 


As in R, the convolution product is linked to Dirac sequences on T, formed
by regulated functions $\varphi_n(u)$ such that


(5.1)    $f(1) = \lim \int f(u)\, \varphi_n(u)\, dm(u)$


for every regulated function $f$ continuous at the point $u = 1$.
The conditions to impose on them are the same as in Chap. V, n° 27.
The first is that


(D 1)    $\int \varphi_n(u)\, dm(u) = 1$ for every $n$;

and then $f(1) = \int f(1)\, \varphi_n(u)\, dm(u)$, whence

(5.2)    $\left| \int f(u)\, \varphi_n(u)\, dm(u) - f(1) \right| \le \int |f(u) - f(1)| \cdot |\varphi_n(u)|\, dm(u).$


Let us take an $r > 0$, and, on the right hand side of (2), distinguish the
contributions of the arcs $|u - 1| < \delta$ and $|u - 1| \ge \delta$ of T for some $\delta > 0$ yet to
be decided. Since $f$ is continuous at the origin, one can, for $r$ given, choose
$\delta$ so that


(5.3)    $|u - 1| < \delta \implies |f(u) - f(1)| < r.$


If one assumes that

(D 2)    $\sup_n \int |\varphi_n(u)|\, dm(u) = M < +\infty,$


the contribution of this “small” arc to the total integral is $\le Mr$. On the
“large” arc $|u - 1| \ge \delta$ we have $|f(u) - f(1)| \le 2\|f\|$ since $f$ is bounded;
the corresponding integral is thus, up to a factor $2\|f\|$, bounded by that of
$|\varphi_n(u)|$. Assume now that, for any $r$ and $\delta$, there exists an integer $N(\delta, r)$
such that

$$n > N(\delta, r) \implies \int_{|u-1| \ge \delta} |\varphi_n(u)|\, dm(u) < r,$$


in other words that


(D 3)    $\lim_n \int_{|u-1| \ge \delta} |\varphi_n(u)|\, dm(u) = 0$ for every $\delta > 0$.


The preceding arguments now show that for every $r$ and every $\delta$ satisfying
(3) we will have


(5.4)    $\int |f(u) - f(1)| \cdot |\varphi_n(u)|\, dm(u) \le (M + 2\|f\|)\, r$


for every $n > N(\delta, r)$, whence (1).
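
The estimate (5.4) can be set out in a single chain. The display below is only a restatement of the argument just given, written with $\varphi_n$ for the functions of the Dirac sequence and $\delta$ for the radius of the small arc; it assumes $n > N(\delta, r)$ and uses nothing beyond (3), (D 2) and (D 3).

```latex
\begin{align*}
\int |f(u) - f(1)|\,|\varphi_n(u)|\,dm(u)
 &= \int_{|u-1|<\delta} |f(u) - f(1)|\,|\varphi_n(u)|\,dm(u)
  + \int_{|u-1|\ge\delta} |f(u) - f(1)|\,|\varphi_n(u)|\,dm(u) \\
 &\le r \int |\varphi_n(u)|\,dm(u)
  + 2\|f\| \int_{|u-1|\ge\delta} |\varphi_n(u)|\,dm(u) \\
 &\le rM + 2\|f\|\,r = (M + 2\|f\|)\,r .
\end{align*}
```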

The conditions (D 1), (D 2) and (D 3) may therefore be taken as the 
definition of the Dirac sequences on T. 

The most frequent case is that where the $\varphi_n$ are all positive; (D 1) then
implies (D 2) with $M = 1$. To achieve (D 3) it is simplest to assume that for
every $\delta > 0$ the functions $\varphi_n$ converge uniformly to 0 on the arc $|u - 1| \ge \delta$
of T, in other words that


(5.5)    $\lim \varphi_n(u) = 0$ uniformly on every compact $K \subset \mathbb{T} - \{1\}$


since the points of a compact subset of T not containing the point u = 1 
remain “standing off” from it. 
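
As an illustration of the positive case, one may test (D 1) and (D 3) numerically on a concrete sequence. The sketch below uses the Fejér kernel, which is our own choice of example and is not introduced in the text at this point; it works in the R version, measuring the distance to the origin by the parameter $t \in [-\frac12, \frac12]$ rather than by $|u - 1|$, and the helpers `fejer` and `integrate` are hypothetical names.

```python
import math

def fejer(N, t):
    """Fejer kernel of order N at t (period 1); it is >= 0 everywhere."""
    s = math.sin(math.pi * t)
    if abs(s) < 1e-12:
        return float(N + 1)  # limiting value at the integers
    return (math.sin(math.pi * (N + 1) * t) / s) ** 2 / (N + 1)

def integrate(func, a, b, steps=20000):
    """Midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

delta = 0.05  # the "small" arc is |t| < delta
results = []
for N in (5, 20, 80):
    total = integrate(lambda t: fejer(N, t), -0.5, 0.5)       # condition (D 1)
    far = integrate(lambda t: fejer(N, t), delta, 1 - delta)  # mass on the "large" arc, (D 3)
    results.append((N, total, far))
    print(N, round(total, 6), round(far, 6))
```

The total mass stays equal to 1 up to rounding, while the mass carried by the large arc shrinks as N grows, which is what (D 3) requires.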


If one applies (1) to the function $u \mapsto f(vu^{-1})$ for a given $v \in \mathbb{T}$, one
obtains more generally the formula


(5.6)    $f(v) = \lim \int f(vu^{-1})\, \varphi_n(u)\, dm(u) = \lim f * \varphi_n(v),$

so long as $f$ is assumed continuous at the point $v$.


In practice, one needs a more precise result. 
Lemma. If $f$ is continuous on an open arc $J$ of T then

(5.7)    $f(v) = \lim f * \varphi_n(v)$

uniformly on every compact $K \subset J$, for every Dirac sequence on T.


Applied to the function $u \mapsto f(vu^{-1})$, the relation (4) shows that, if $f$ is
continuous at the point $v$,


(5.8)    $|f * \varphi_n(v) - f(v)| \le (M + 2\|f\|)\, r$


for $n$ large; but to obtain uniform convergence on $K$ one has to find an integer
$N$ such that (8) will be valid for $n > N$ for all the $v \in K$ simultaneously.
Now, for $r$ and $v$ given, the integer $N$ depends only, as we have seen, on the
choice of a $\delta$ such that


(5.9)    $|f(vu^{-1}) - f(v)| < r$ for $|u - 1| < \delta$.


So it all reduces to showing that, for any $r$, there exists a $\delta$ satisfying (9)
for all $v \in K$ simultaneously. Since $vu^{-1}$ is “close” to $v$ for $u$ “close” to 1,
we are manifestly dealing with a uniform continuity property of $f$.

Assume now that $f$ is continuous on the open arc $J$ of T and let $K$ be a
compact arc contained in $J$ (figure 1). Since $\mathbb{T} - J$ and $K$ are compact and
disjoint, the distance $d(\mathbb{T} - J, K) = d$ is strictly positive. Since

$$|vu^{-1} - v| = |v - vu| = |1 - u|$$

for every $u \in \mathbb{T}$, we see that

$$(v \in K) \ \&\ (|u - 1| < d) \implies vu^{-1} \in J.$$

For every $\delta < d$, the set $K(\delta) \supset K$ of points $vu^{-1}$ with $v \in K$ and $|u - 1| \le \delta$
is thus contained in $J$; moreover it is compact like $K$ and the arc $|u - 1| \le \delta$ of
T (use BW or, more elementarily, define the arcs of T by inequalities between
the polar angles of their points). But since $f$ is continuous in $J$ it is uniformly
continuous on $K(\delta)$. So we see that for any given $r > 0$ there exists a $\delta > 0$
such that (9) holds for all $v \in K$, qed.


Fig. 1.

We leave to the reader the task of verifying, as in Chap. V, n° 27, that if
the $\varphi_n$ are indefinitely differentiable, then so likewise are the functions $f * \varphi_n$.
This follows in the usual way from the standard theorem on differentiation
under the $\int$ sign (Chap. V, n° 9).


§ 2. Elementary theorems on Fourier series


6 — Absolutely convergent Fourier series 


Almost all the simultaneously simple and important results in the theory of 
Fourier series, especially those which can be generalised, can be deduced from 
one fundamental statement: 


Theorem 1 (Weierstrass). Every continuous periodic function is the uni- 
form limit of trigonometric polynomials. 


Instead of proving this now, we shall, in this §, show how one may use it; 
we shall present an “elementary” proof later (n° 12, Theorem 8 and n° 23), 
more complicated than the general “abstract” Stone-Weierstrass theorem of 
Chap. V, n° 28. 

The most immediate consequence of Theorem 1 is the following: 


Theorem 2. If $f$ is a continuous function on T such that $\sum |\hat f(n)| < +\infty$,
then


$$f(u) = \sum \hat f(n)\, u^n \quad \text{for every } u \in \mathbb{T}.$$


Let us denote the right hand side by $g(u)$. This is the sum of an absolutely
convergent Fourier series, whence (Chap. V, n° 5) $\hat g(n) = \hat f(n)$ for every $n$.
Putting $f = g + h$, we see that all the Fourier coefficients of the function $h$
vanish.

The relation $\hat h(n) = 0$ for every $n$ means that, with respect to the standard
scalar product

$$(f|g) = \int f(u)\, \overline{g(u)}\, dm(u)$$

of two functions on T, the function $h$ is “orthogonal” to all the characters
$\chi_n : u \mapsto u^n$ of T: $(h|\chi_n) = 0$. It is therefore also orthogonal to every linear
combination of a finite number of these functions, i.e. to every trigonometric
polynomial $p$.

Now $h$ is continuous like $f$ (by hypothesis) and $g$ (the sum of a normally
convergent series of continuous functions). By Theorem 1, there therefore
exists a sequence $(p_n)$ of trigonometric polynomials which converges to $h$
uniformly on T. Since the functions $h(u)\, \overline{p_n(u)}$ converge to $|h(u)|^2$ uniformly
on T we deduce that


$$\int |h(u)|^2\, dm(u) = \lim \int h(u)\, \overline{p_n(u)}\, dm(u) = \lim (h|p_n) = 0.$$


The function $|h(u)|^2$ being continuous and positive, we have $h = 0$ (Chap. V,
n° 2), whence $f = g$, qed.

Here is an easy consequence of Theorem 2: for a continuous function $f$
to be represented by an absolutely convergent Fourier series, it is necessary
and sufficient that $\sum |\hat f(n)| < +\infty$. The condition is sufficient by Theorem 2.
We know on the other hand (Chap. V, n° 5) that if a function $f$, necessarily
continuous, is the sum of an absolutely convergent Fourier series, then the
coefficients of the latter must be the numbers $\hat f(n)$; the one and only series
which represents $f$ is then the Fourier series of $f$.

The theorem of Cantor mentioned above shows much more: two distinct 
trigonometric series (i.e. not having the same coefficients) and everywhere 
convergent (absolutely or not) cannot have the same sum. 


7 — Hilbertian calculations 


Let us denote by $\mathcal{H}$ the complex vector space (of infinite dimension) of regulated
functions on T and let us endow it with the usual scalar product


(7.1)    $(f|g) = \int f(u)\, \overline{g(u)}\, dm(u).$


It has the same properties as in Chap. V, n° 3: it is a linear function of $f$ for
$g$ given, we also have


(7.2)    $(g|f) = \overline{(f|g)}$

and finally

(7.3)    $(f|f) = \int |f(u)|^2\, dm(u) \ge 0$


for any $f$. As in Chap. V and, more generally, as in every pre-Hilbert space
(Appendix to Chap. III), it therefore satisfies the Cauchy-Schwarz inequality


(7.4)    $|(f|g)|^2 \le (f|f)\, (g|g).$

We deduce that the expression


(7.5)    $\|f\|_2 = (f|f)^{1/2} = \left( \int |f(u)|^2\, dm(u) \right)^{1/2}$

has the properties


(7.6)    $\|\lambda f\|_2 = |\lambda| \cdot \|f\|_2, \qquad \|f + g\|_2 \le \|f\|_2 + \|g\|_2$


of a “norm” on the vector space $\mathcal{H}$ of regulated functions on T, except that the
relation $\|f\|_2 = 0$ shows only that the set $\{f(u) \ne 0\}$ is countable (Chap. V,
n° 7, Theorem 7) and not that $f = 0$. This is not important since two
regulated functions which are equal outside a countable set have the same
integrals and so the same Fourier series.

As in the case of the norm of uniform convergence, the second relation
(6) shows that the expression

$$d_2(f, g) = \|f - g\|_2 = \left( \int |f(u) - g(u)|^2\, dm(u) \right)^{1/2}$$


is a “distance” between $f$ and $g$ (distance in quadratic mean).
This done, one says that two functions $f$ and $g$ are orthogonal if $(f|g) = 0$,
a concept we have already used above. We now have the Pythagoras relation


(7.7)    $\|f + g\|_2^2 = \|f\|_2^2 + \|g\|_2^2$


since $(f + g \,|\, f + g) = (f|f) + (f|g) + (g|f) + (g|g)$. This extends to a finite
sum of pairwise orthogonal functions $f_i$: indeed


$$\Big( \sum f_i \,\Big|\, \sum f_j \Big) = \sum (f_i | f_j) = \sum (f_i | f_i) \quad \text{if } (f_i | f_j) = 0 \text{ for } i \ne j.$$


In particular, $(e_p | e_q) = 0$ or 1, whence


(7.8)    $\Big( \sum a_p e_p \,\Big|\, \sum b_q e_q \Big) = \sum a_p \overline{b_p}$


at least when dealing with finite sums, i.e. trigonometric polynomials.

With these definitions, the Fourier coefficients of a function $f$ are, as
we have already seen, given by $\hat f(n) = (f | e_n)$ where $e_n$ is the exponential
function $e_n(t)$, in version R, or $u^n$, in version T. If we consider the partial
sum


(7.9)    $f_N = \sum_{|n| \le N} \hat f(n)\, e_n$


of the Fourier series of $f$, we have $(f_N | e_n) = \hat f(n) = (f | e_n)$ for $|n| \le N$
by (8), and so $(f - f_N | e_n) = 0$. The function $f - f_N$ being orthogonal to
the exponentials $e_n$ such that $|n| \le N$ it is also orthogonal to every linear
combination of these, and in particular to the function $f_N$ itself. Since
$f = (f - f_N) + f_N$, Pythagoras’ theorem shows that


(7.10)    $(f|f) = (f_N | f_N) + (f - f_N \,|\, f - f_N) \ge (f_N | f_N).$

But by (8)

(7.11)    $(f_N | f_N) = \sum_{|n| \le N} |\hat f(n)|^2.$


The partial sums of the series with positive terms $\sum |\hat f(n)|^2$ are therefore
bounded above by $(f|f)$; they consequently converge, and we have


(7.12)    $\sum |\hat f(n)|^2 \le \|f\|_2^2 = \int |f(u)|^2\, dm(u) = \oint |f(t)|^2\, dt$


for every regulated function on T.
These calculations, which generalise the traditional ones in $\mathbb{R}^3$ starting
from the “unit vectors” of a system of rectangular coordinates, are valid
in every pre-Hilbert space. If for example you have a sequence of continuous
functions $P_n(t)$ on a compact interval $I \subset \mathbb{R}$ (these are often, in
practice, polynomials or the solutions of differential equations) satisfying
$\int P_k(t)\, P_h(t)\, dt = 0$ or 1 according to whether $k \ne h$ or $k = h$, and if for every
continuous function $f$ on $I$ you put


(7.13)    $c_n(f) = \int f(t)\, P_n(t)\, dt,$


then you obtain the inequality $\sum |c_n(f)|^2 \le \int |f(t)|^2\, dt$. In the good cases,
one hopes (while there is life there is hope) to obtain not only an equality
but even an expansion


(7.14)    $f(t) = \sum c_n(f)\, P_n(t)$


in a convergent series. The Fourier series were, historically, the first case to 
present themselves, and have, of course, inspired the many later generalisa- 
tions of which we have just described the simplest. 


8 — The Parseval-Bessel equality 


The inequality (7.12) is in reality an equality as we have seen in Chap. V, n° 5 
in the simple case of absolutely convergent Fourier series. We shall now prove 
this for every regulated function, using Weierstrass’ approximation theorem. 

By (7.10) and (7.11), it reduces to proving that $\|f - f_N\|_2$ tends to 0, i.e.
that


(8.1)    $\lim_{N \to \infty} \oint \Big| f(t) - \sum_{|n| \le N} \hat f(n)\, e_n(t) \Big|^2 dt = 0,$


but to write integrals of this kind explicitly would be the best method of not 
understanding the proof, and to avail oneself of Knuth’s software in vain. 

In the complex vector space $\mathcal{H}$ of the preceding n°, let $\mathcal{H}_N$ be the set of
trigonometric polynomials involving only the $e_n$, $|n| \le N$, in other words, the
vector subspace generated by these $2N + 1$ functions; it contains $f_N$. Since
$f - f_N$ is orthogonal to the $e_n \in \mathcal{H}_N$, it is orthogonal to every $p \in \mathcal{H}_N$ as
we saw above. On writing $f - p = (f - f_N) + (f_N - p)$ and observing that
$f_N - p \in \mathcal{H}_N$ we then have


(8.2)    $(f - p \,|\, f - p) = (f - f_N \,|\, f - f_N) + (f_N - p \,|\, f_N - p) \ge (f - f_N \,|\, f - f_N)$

for any $p \in \mathcal{H}_N$. In other words, $f_N$ is the point of the vector subspace $\mathcal{H}_N$
lying at the minimum distance from $f$, which is plausible (figure 2) since it is
the “orthogonal projection” of $f$ onto $\mathcal{H}_N$.
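
For the reader who wants the intermediate step in (8.2) written out, here is a sketch of the expansion. It uses only the linearity properties of the scalar product and the fact, noted above, that $f - f_N$ is orthogonal to every element of the subspace spanned by the $e_n$, $|n| \le N$, in particular to $f_N - p$, so that the two cross terms vanish.

```latex
\begin{align*}
(f - p \,|\, f - p)
 &= \bigl((f - f_N) + (f_N - p) \,\big|\, (f - f_N) + (f_N - p)\bigr) \\
 &= (f - f_N \,|\, f - f_N) + (f - f_N \,|\, f_N - p)
  + (f_N - p \,|\, f - f_N) + (f_N - p \,|\, f_N - p) \\
 &= (f - f_N \,|\, f - f_N) + (f_N - p \,|\, f_N - p)
  \;\ge\; (f - f_N \,|\, f - f_N).
\end{align*}
```

Equality holds only for $p = f_N$, since $f_N - p$ is a continuous function whose quadratic norm would then vanish.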


Fig. 2.


It follows that, to establish (1), it is enough to prove that, for every $r > 0$,
there exists some trigonometric polynomial $p$ such that


(8.3)    $(f - p \,|\, f - p) < r;$


such a polynomial $p$ belongs of course to every $\mathcal{H}_N$ of sufficiently large index,
so that, by (7.10),


(8.4)    $0 \le (f|f) - (f_N | f_N) = (f - f_N \,|\, f - f_N) \le (f - p \,|\, f - p) < r$


for $N$ large, which will establish the Parseval-Bessel equality.

Relation (3) is a theorem on approximation by trigonometric polynomials;
but instead of measuring the “distance” between two functions $f$ and $g$ by the
uniform convergence norm (which is doomed to failure if $f$ is not continuous),
one measures it by the function $\|f - g\|_2$ which leads to convergence in
quadratic mean, while on using the distance


$$\|f - g\|_1 = \int |f(u) - g(u)|\, dm(u)$$


one obtains convergence in mean (Chap. V, end of n° 4), much less easy to
manipulate than the preceding in this context.

Let us return to the proof of (3). There is no problem if $f$ is continuous:
Weierstrass provides a trigonometric polynomial $p$ such that $|f(u) - p(u)| < r$
for any $u$, which is incomparably better than (3). In the general case,
it reduces to showing that $f$ can be approximated in quadratic mean by
continuous functions $g$, for if $\|f - g\|_2 < r$ and $\|g - p\|_2 < r$, it follows that
$\|f - p\|_2 < 2r$. To do this it is enough to have a general result which could also
well be obtained from the definition of the integrable functions at Chap. V:


Lemma. Let $f$ be a regulated function on a compact interval $I \subset \mathbb{R}$ (resp.
on T). Then, for every $r > 0$, there exists a continuous function $g$ on $I$ (resp.
T) such that $\|f - g\|_2 < r$, or $\|f - g\|_1 < r$.


Fig. 3.


One may assume $I = [0, 1]$. There exists a step function $\varphi$ on $I$ such that


$$|f(t) - \varphi(t)| \le r$$


for any $t \in I$, whence

$$\|f - \varphi\|_1 \le r$$


and the same inequality for the other norm. It therefore suffices to establish
the lemma for the step function $\varphi$. Figure 3 indicates the method: one
replaces $\varphi$ by a continuous piecewise linear function equal to $\varphi$ except on
neighbourhoods of the discontinuities of $\varphi$; if one puts $M = \sup |\varphi(t)|$ and
if $\varphi$ has $n$ discontinuities in $[0, 1]$, one may choose the $n$ intervals on which
one modifies it so that their lengths are less than $r/Mn$; the contribution
of such an interval to the integral of $|f - \varphi|$ is then less than the length
$r/Mn$ of the latter multiplied by the maximum of $|f - \varphi|$ on this interval,
so to $Mr/Mn = r/n$, whence, for the $n$ intervals which actually contribute
to the integral of $|f - \varphi|$, a total contribution of $< r$. For approximation
in quadratic mean, one chooses intervals of length $< r^2/M^2n^2$. The case of
periodic functions is treated similarly, arguing on T instead of on $I$.
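
The construction can be tried on the simplest possible case. The sketch below is only an illustration under our own assumptions: a step function with a single jump at $t = 1/2$ (so $M = 1$ and $n = 1$), replaced by a linear ramp on an interval of width `w` centred at the jump; the names `step`, `ramp` and `l1_distance` are hypothetical.

```python
def step(t):
    """A step function on [0, 1] with one discontinuity, at t = 1/2."""
    return 0.0 if t < 0.5 else 1.0

def ramp(t, w):
    """Continuous piecewise linear function equal to step(t) outside (0.5 - w/2, 0.5 + w/2)."""
    if t <= 0.5 - w / 2:
        return 0.0
    if t >= 0.5 + w / 2:
        return 1.0
    return (t - (0.5 - w / 2)) / w

def l1_distance(w, steps=100000):
    """Midpoint-rule approximation of the integral of |step - ramp| over [0, 1]."""
    h = 1.0 / steps
    return h * sum(abs(step((k + 0.5) * h) - ramp((k + 0.5) * h, w)) for k in range(steps))

for w in (0.2, 0.04, 0.008):
    print(w, l1_distance(w))
```

Here the distance in mean is the area of two small triangles, namely `w / 4`, so it is indeed smaller than the width of the modified interval and tends to 0 with it.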

[Artificial proof of too limited a result. In the Lebesgue theory (Bourbaki 
model), an integrable (resp. square integrable) function is, almost by defi- 
nition, a limit in mean (resp. in quadratic mean) of continuous functions. 
There is nothing else to prove, except perhaps the integrability of the regu- 
lated functions, a “result” whose proof takes three lines and which, at this 
level, is totally uninteresting. ] 

However it may be, these roundabout procedures provide a result, which,
fundamental though it is, is still too limited:

Theorem 3 (Parseval-Bessel¹⁴). Let $f$ be a regulated periodic function.
Then the series $\sum |\hat f(n)|^2$ is convergent and


(8.5)    $\sum |\hat f(n)|^2 = \|f\|_2^2 = \int |f(u)|^2\, dm(u) = \oint |f(t)|^2\, dt,$


(8.5’)    $\lim_{N \to \infty} \oint \Big| f(t) - \sum_{|n| \le N} \hat f(n)\, e_n(t) \Big|^2 dt = 0.$


Corollary. Let $f$ and $g$ be two regulated periodic functions. Then the series
$\sum \hat f(n)\, \overline{\hat g(n)}$ converges absolutely and


(8.6)    $\sum \hat f(n)\, \overline{\hat g(n)} = (f|g) = \int f(u)\, \overline{g(u)}\, dm(u) = \oint f(t)\, \overline{g(t)}\, dt.$


The proof consists of using the algebraic identity


(8.7)    $4(f|g) = (f + g \,|\, f + g) - (f - g \,|\, f - g) + i(f + ig \,|\, f + ig) - i(f - ig \,|\, f - ig)$


which follows formally (calculate mechanically by expanding the squares
without writing any integrals) from the fact that the scalar product $(f|g)$
is, for $g$ given, a linear function of $f$ and satisfies $(g|f) = \overline{(f|g)}$; the relation
(7) generalises the identity


(8.8)    $4u\bar v = |u + v|^2 - |u - v|^2 + i|u + iv|^2 - i|u - iv|^2$


between complex numbers. Having done this, one applies Parseval-Bessel to
the functions $f + g$, $f - g$, $f + ig$ and $f - ig$ that appear in (7), and applies
(8) to $u = \hat f(n)$ and $v = \hat g(n)$. Next one remarks that the series $\sum \hat f(n)\, \overline{\hat g(n)}$
is a linear combination of four absolutely convergent series, so converges
absolutely, and that its sum is the scalar product $(f|g)$.
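
The identity between complex numbers is easy to test numerically. A minimal sketch, in which the function name `polarization` and the sample values of `u` and `v` are arbitrary choices of ours:

```python
def polarization(u, v):
    """One quarter of the right hand side of the identity (8.8)."""
    return (abs(u + v) ** 2 - abs(u - v) ** 2
            + 1j * abs(u + 1j * v) ** 2 - 1j * abs(u - 1j * v) ** 2) / 4

u, v = 1.5 - 2.0j, -0.25 + 3.0j   # arbitrary complex numbers
print(polarization(u, v), u * v.conjugate())
```

Both printed values agree up to rounding; the same four-term pattern, applied to scalar products instead of squared moduli, is the identity (8.7).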


14 This equality was published by M.-A. Parseval (1755-1836) in 1805 in the
Mémoires de l’Académie des Sciences; Parseval considered two series of the form
$P(t) = \sum a_n t^n$ and $Q(t) = \sum b_n t^{-n}$ (summing over $\mathbb{N}$), and remarked, up to
notation, that $P(t)Q(t) = \sum c_n t^n$ (summing over $\mathbb{Z}$) with $c_0 = \sum a_n b_n$, and
then considered it obvious that


$$\sum a_n b_n = \frac{1}{2\pi} \int_0^\pi \left[ P(e^{it})\, Q(e^{it}) + P(e^{-it})\, Q(e^{-it}) \right] dt,$$


just a simple formal calculation. The astronomer Friedrich Wilhelm Bessel (1784-1846),
ultrafamous for his work in celestial mechanics, published the inequality
$\sum |\hat f(n)|^2 \le \|f\|^2$ in a memoir of 1828 on periodic phenomena, where he used
the expansion in Fourier series without reference to its proof or to problems of
convergence. I. Grattan-Guinness, Joseph Fourier 1768-1830 (MIT Press, 1972),
pp. 240 and 376. It was at the end of the century, with the appearance of the
first works on “functional” Hilbert spaces, that the theorem would be proved
correctly and its importance highlighted.

Another corollary: assume that the Fourier series of a regulated function $f$
converges absolutely, i.e. that


$$\sum |\hat f(n)| < +\infty$$


and let $g$ be the sum of the latter. By n° 5 of Chap. V, we must have
$\hat g(n) = \hat f(n)$, which suggests that $g = f$ “more or less”; this is precisely what
we proved in Theorem 2 assuming $f$ continuous.

In the general case, let us again consider the function $h = f - g$. It is
regulated and its Fourier coefficients $\hat h(n) = \hat f(n) - \hat g(n)$ are all zero.
Parseval-Bessel then shows that $\int |h(t)|^2\, dt = 0$ and thus that $h(t) = 0$ except maybe
on a countable set $D$ of points (Chap. V, n° 7, Theorem 7). We thus see that


$$f(t) = \sum \hat f(n)\, e_n(t)$$


for every $t \notin D$.

Exercise. Consider a sequence of polynomials $P_n(t)$ on a compact interval
$I$, satisfying $\int P_k(t)\, P_h(t)\, dt = 0$ or 1 and such that $d^\circ(P_n) = n$ for every $n$.
Show that every continuous function $f$ on $I$ is the uniform limit of (finite)
linear combinations of the $P_n$ and deduce that $\int |f(t)|^2\, dt = \sum |c_n(f)|^2$
[notation (7.13)].

The preceding results show that if one associates to each $f$ its Fourier
transform $\hat f : n \mapsto \hat f(n)$, one obtains a linear map of $\mathcal{H}$ into $L^2(\mathbb{Z})$ which
preserves scalar products. The reader who would like to understand the differ- 
ence in effectiveness between the integrals of Riemann and those of Lebesgue 
may ask himself the question of whether this map is surjective. Negative re- 
sponse chez Riemann, positive chez Lebesgue, whose theory there found one 
of its first great successes. 

To understand the problem, let us start from a function $c(n)$ in $L^2(\mathbb{Z})$;
we are to find an $f \in \mathcal{H}$ such that $\hat f(n) = c(n)$ for every $n$. If $f$ exists we
must have


(8.9)    $f_N(u) = \sum_{|n| \le N} c(n)\, u^n$ and $\lim \|f - f_N\|_2 = 0$.


Now the $f_N$ form a Cauchy sequence in $\mathcal{H}$, since for $p < q$ one has, by (7.8),


(8.10)    $\|f_p - f_q\|_2^2 = \sum_{p < |n| \le q} |c(n)|^2,$

a result arbitrarily small for $p$ large since $\sum |c(n)|^2 < +\infty$. The question
asked is thus to decide if the convergence in quadratic mean is, in $\mathcal{H}$, guaranteed
by Cauchy’s criterion, in other words: is $\mathcal{H}$ a complete space in the
sense of the Appendix to Chap. III? Negative response in Riemann theory,
positive (Riesz-Fischer theorem) in the Lebesgue theory where one considers
the much more general “square integrable” functions. This is one of the many
reasons which show that one can probably never surpass the present theory
of integration (that of Lebesgue) or, to be more prudent, that it is not
worth seeking to do this if one is not interested in the ultra fine and ultra
specialised mathematics of Baire’s successors.


For lack of great “modern” results, i.e. not more than a century old, one 
may always turn back to Euler and Fourier. 


Example 1 (Fourier). Consider the periodic function equal to $t$ for $|t| < \frac12$;
the values at the end-points are immaterial. Integrating by parts, we have,
for $n \ne 0$,

$$\hat f(n) = \int_{-1/2}^{1/2} t\, \overline{e_n(t)}\, dt = \left[ \frac{t\, \overline{e_n(t)}}{-2\pi i n} \right]_{-1/2}^{1/2} + \frac{1}{2\pi i n} \int_{-1/2}^{1/2} \overline{e_n(t)}\, dt;$$

the last integral is zero since $e_n$ is orthogonal to $e_0$; since $e_n(t) = (-1)^n$ for
$t = \pm\frac12$, it follows that

$$\hat f(0) = 0, \qquad \hat f(n) = (-1)^{n+1}/2\pi i n.$$

The integral of $t^2$ being equal to $1/12$, one finds the relation

$$\sum 1/4\pi^2 n^2 = 1/12$$

where the sum is taken over all nonzero $n \in \mathbb{Z}$. Whence again the relation
$\sum 1/n^2 = \pi^2/6$.
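
One can watch Bessel's inequality turn into the Parseval-Bessel equality on this example. The sketch below simply sums the squared moduli $1/4\pi^2 n^2$ of the coefficients found above over $0 < |n| \le N$; the function name `bessel_partial_sum` and the chosen values of N are ours.

```python
import math

def bessel_partial_sum(N):
    """Sum of |f^(n)|^2 = 1/(4 pi^2 n^2) over 0 < |n| <= N, for f(t) = t on |t| < 1/2."""
    # the terms for n and -n are equal, hence the factor 2
    return sum(2.0 / (4 * math.pi ** 2 * n ** 2) for n in range(1, N + 1))

norm_squared = 1.0 / 12.0  # integral of t^2 over [-1/2, 1/2]
for N in (1, 10, 100, 1000):
    print(N, bessel_partial_sum(N), norm_squared)
```

Every partial sum stays below $1/12$, as (7.12) requires, and the gap closes like $1/2\pi^2 N$.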


Example 2. Consider the function of period 1 such that

$$f(t) = e^{2\pi i z t} \quad \text{for } |t| < \tfrac12,$$

where $z$ is a complex number, not an integer, since otherwise the interest of
the problem evaporates. We have

$$\hat f(n) = \int_{-1/2}^{1/2} e^{2\pi i (z - n) t}\, dt = \left[ \frac{e^{2\pi i (z - n) t}}{2\pi i (z - n)} \right]_{-1/2}^{1/2}$$

since, for every $\lambda \in \mathbb{C}$, the derivative of $e^{\lambda t}$ is $\lambda e^{\lambda t}$ (Chap. IV, n° 10, obvious
since $e^{\lambda t} = \sum \lambda^n t^n/n!$). Whence


(8.11)    $\hat f(n) = (-1)^n \sin \pi z / \pi(z - n).$


Considering now the function $g(t) = \overline{f(-t)}$, on passing to the T interpretation
and bearing in mind the symmetry of the invariant measure we have


(8.12)    $\hat g(n) = \int \overline{f(u^{-1})}\, u^{-n}\, dm(u) = \int \overline{f(u)}\, u^n\, dm(u) = \overline{\int f(u)\, u^{-n}\, dm(u)} = \overline{\hat f(n)}.$

The last corollary then shows (a general result of course) that


(8.13)    $\sum \hat f(n)^2 = \oint f(t)\, f(-t)\, dt.$


In the present case we have $f(t) f(-t) = 1$ for any $t$, whence, by (11), the
identity

$$1 = \frac{\sin^2 \pi z}{\pi^2} \sum \frac{1}{(z - n)^2}$$

or, replacing $z$ by $z/\pi$,


(8.14)    $\frac{1}{\sin^2 z} = \sum \frac{1}{(z - n\pi)^2}.$


We have already obtained this formula in Chap. II, n° 21, by formally differentiating
the expansion of $\cot z$ in series of rational fractions, then in Chap. III,
n° 17, Example 4 by more orthodox arguments.
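
The expansion of $1/\sin^2 z$ as a sum of the terms $1/(z - n\pi)^2$ over $n \in \mathbb{Z}$ lends itself to a quick numerical check. The sketch below truncates the sum at $|n| \le N$; the function name `partial_fraction_sum`, the real test point `z = 1.0` and the cut-off are arbitrary choices of ours.

```python
import math

def partial_fraction_sum(z, N):
    """Sum of 1/(z - n*pi)^2 over the integers n with |n| <= N."""
    return sum(1.0 / (z - n * math.pi) ** 2 for n in range(-N, N + 1))

z = 1.0
exact = 1.0 / math.sin(z) ** 2
approx = partial_fraction_sum(z, 100000)
print(exact, approx, exact - approx)
```

The neglected tail is of the order of $2/\pi^2 N$, so the truncated sum approaches the exact value slowly but steadily from below.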


In his memoir on the propagation of heat, Fourier calculated similar expansions;
he considered for example the function of period $\pi$ (and not $2\pi$)
equal to $\sin x$ (or to $\cos x$) between 0 and $\pi$ and expanded it as a function of
period $2\pi$. He also considered the function of period $2\pi$ equal to $\cos x$ between
$-\pi/2$ and $\pi/2$ and zero between $\pi/2$ and $3\pi/2$, etc. These examples were particularly
bold at the time, since they yield a series of the form $\sum a_n \cos nx$
whose sum is equal to $\cos x$ in the first interval and to 0 in the second, i.e.
a series of analytic functions whose sum is not analytic¹⁵; Lagrange, who
had all the same met “Fourier” series in 1759 à propos the vibrating string
problem but rejected them because of their periodicity and who attempted to
found all of analysis on power series, criticised Fourier’s memoir briskly. The
reader will easily find the coefficients in these formulae and may be interested
to trace the graphs of these bizarre functions, as Fourier himself did.


9 — Fourier series of differentiable functions 


In the sequel we shall write $C^p(\mathbb{T})$ for the set of periodic functions of class
$C^p$, and $\mathcal{D}(\mathbb{T}) = C^\infty(\mathbb{T})$ as in R. As we shall see, Theorem 2 always applies
to the functions of $C^1(\mathbb{T})$. Let us first make several remarks on the formula
for integration by parts.

This is particularly simple in the case of two periodic functions $f$ and $g$
of class $C^1$ in R. When one integrates $fg' + f'g$ over an interval $[a, a + 1]$ in
R, the term $f(a + 1)g(a + 1) - f(a)g(a)$ cancels to zero because $f$ and $g$ are
periodic. Writing generally $f'(u)$ for the function which, on T, corresponds
through $f'(e(t)) = f'(t)$ to the periodic function¹⁶ $f'(t)$ on R, we then have


(9.1)    $\int f'(u)\, g(u)\, dm(u) = -\int f(u)\, g'(u)\, dm(u)$


or, in the R version,


(9.2)    $\oint f'(t)\, g(t)\, dt = -\oint f(t)\, g'(t)\, dt.$


15 although the problem of vibrating strings had already suggested this kind of
phenomenon to d’Alembert, Euler and Daniel Bernoulli, who did not pursue it
(and argued about this subject). One may find Fourier’s text, explanations and
a biography of the prefect of the Isère, a position he occupied while writing his
memoir, in I. Grattan-Guinness, Joseph Fourier 1768-1830 (MIT Press, 1972).


This result extends to the periodic functions which are primitives of regulated
functions, but this requires some explanations. Periodic or not, a function
$f$ is, on R, a primitive of a regulated function $f'$ if (i) $f$ is continuous,
(ii) $f$ admits right and left derivatives at each point $t \in \mathbb{R}$, equal to the
limits $f'(t+)$ and $f'(t-)$ of $f'$; the derivative $f'$ is thus periodic if $f$ is.
Theorem 12 bis of Chap. V, n° 13, i.e. the FT, being valid for the primitives of
regulated functions, one has


$$f(1) - f(0) = \int_0^1 f'(t)\, dt$$


so that the mean value of f’ is zero if f is periodic. Formula (2) remains 
valid since, if f and g are periodic primitives of regulated functions f’ and 
g', necessarily periodic, the function fg is manifestly a periodic primitive of 
f'g + fg’; since fg is periodic, the integral of f’g + fg’ over a period is zero 
as we have just seen, whence (2). 

One must not believe that a regulated periodic function $f$ always admits a
periodic primitive. If indeed (the only possibility up to an additive constant)
one puts


$$F(t) = \int_0^t f(x)\, dx$$


as in Chap. V, n° 13, it is clear that $F$ is periodic if and only if the mean
value of $f$ is zero: write that $F(1) = F(0)$. One could adopt the preceding
formula for $t \in [0, 1[$ and define $F$ on R by periodicity, but then one would
have


$$F(1-) - F(1+) = F(1-) - F(0+) = \lim [F(1 - \varepsilon) - F(\varepsilon)] = \int_0^1 f(t)\, dt,$$

whence a discontinuity for $t = 1$ and more generally for $t \in \mathbb{Z}$; not being
continuous, $F$ cannot be a primitive of $f$. In a case of this kind, one
has to add to the right hand side of (2.20’) a term equal to the difference
$f(1-)g(1-) - f(0+)g(0+)$; a notorious source of errors¹⁷ in calculation ...


16 We have $f'(u) = 2\pi i \cdot \lim [f(uv) - f(u)]/(v - 1)$ as $v \in \mathbb{T}$ tends to 1.


This done, let us consider a function $f \in C^1(\mathbb{T})$ and write $Df = f'$ for
its derivative, a periodic continuous function. An integration by parts then
shows, by (2), that


$$\int_0^1 Df(t)\, \overline{e_n(t)}\, dt = 2\pi i n \int_0^1 f(t)\, \overline{e_n(t)}\, dt$$

i.e.


(9.3)    $\widehat{Df}(n) = 2\pi i n\, \hat f(n).$


This calculation is again valid if $f$ is a periodic primitive of a regulated
periodic function, as we have seen above. It does not apply to Example 1 of
the preceding n°, for the periodic function equal to $t$ on $[-\frac12, \frac12[$, not being
everywhere continuous, is not a primitive on R.
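
The relation between the Fourier coefficients of a function and those of its derivative can be observed numerically. In the sketch below everything is an illustrative assumption of ours: the smooth test function $f(t) = e^{\sin 2\pi t}$, the sample size `M`, and the helper `coefficient`, which approximates a Fourier coefficient by a Riemann sum over equally spaced points of one period.

```python
import cmath
import math

M = 256  # number of equally spaced sample points on one period

def coefficient(func, n):
    """Riemann-sum approximation of the n-th Fourier coefficient of a 1-periodic function."""
    return sum(func(k / M) * cmath.exp(-2j * math.pi * n * k / M) for k in range(M)) / M

def f(t):
    """A smooth function of period 1."""
    return math.exp(math.sin(2 * math.pi * t))

def Df(t):
    """The derivative of f, computed by hand."""
    return 2 * math.pi * math.cos(2 * math.pi * t) * f(t)

for n in range(-3, 4):
    # coefficient of Df minus 2*pi*i*n times the coefficient of f
    print(n, abs(coefficient(Df, n) - 2j * math.pi * n * coefficient(f, n)))
```

For such a smooth periodic function the Riemann sums are extremely accurate, and the printed differences are at the level of rounding errors. The same experiment with the sawtooth function of Example 1 would not work, since the boundary term of the integration by parts no longer vanishes.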

Now we know that the series $\sum |\widehat{Df}(n)|^2$ converges since $Df$ is regulated;
so likewise does the series $\sum 1/n^2$. The series $\sum \widehat{Df}(n)/n$ is therefore
absolutely convergent (Cauchy-Schwarz inequality for series). But $\widehat{Df}(n)/n = \hat f(n)$
up to a constant factor. Consequently:


Theorem 4. Let $f$ be a periodic continuous function, the primitive of a
regulated function on R (for example, a periodic function of class $C^1$ on R);
then the Fourier series of $f$ is absolutely convergent and


$$f(t) = \sum \hat f(n)\, e_n(t)$$

for any $t \in \mathbb{R}$.
If $f$ is of class $C^2$ one may iterate (3) and obtain

$$\widehat{D^2 f}(n) = (2\pi i n)^2\, \hat f(n),$$

and so on.
The Parseval-Bessel inequality now shows that


(9.4)    $\sum |n^p \hat f(n)|^2 < +\infty$

if $f \in C^p(\mathbb{T})$, and a fortiori

(9.5)    $\hat f(n) = o(1/n^p).$


17 Pay attention to the fact that the $f \in C^p(\mathbb{T})$ must be of class $C^p$ on R and
not only on a period interval such as $[0, 1]$, since this last property is compatible
with the existence of discontinuities at 0 and 1 of the derivatives of the periodic
function considered. For a function $f$ of class $C^p$ on $[0, 1]$ to be extendable to a
periodic function of class $C^p$ on R it is necessary and sufficient that
$f^{(k)}(0) = f^{(k)}(1)$ for every $k \le p$.


One may wonder whether, conversely, these properties characterise the
Fourier coefficients of functions of class $C^p$; the answer is negative: for $p = 1$,
the relation (4) is satisfied by every primitive of a regulated function, which
allows many discontinuities in the derivative. But it is worth looking closer.
First, (3) shows that if $f$ is a periodic primitive of a regulated function
$Df$, then the Fourier series


(9.6)    $\sum \widehat{Df}(n)\, e_n(t) \overset{*}{=} \sum 2\pi i n\, \hat f(n)\, e_n(t)$


of Df is obtained as if one may differentiate that of f term-by-term, even 
though the general theorem on term-by-term differentiation (Chap. III, n° 17, 
Theorem 19 and Example 2) does not necessarily apply here: the discontinu- 
ities of Df can prevent its Fourier series from converging uniformly or even 
simply (whence the * sign). 

If, however, the right hand side of (6) converges absolutely for a given
regulated periodic function $f$, i.e. if $\sum |n \hat f(n)| < +\infty$, a more restrictive
condition than $\hat f(n) = o(1/n)$, then a fortiori $\sum |\hat f(n)| < +\infty$; one may
then, as we have seen in n° 8, assume that


$$f(t) = \sum \hat f(n)\, e_n(t)$$


everywhere by modifying $f$ on a countable set; since the derived series converges
uniformly, we conclude that $f$ is differentiable and that
$Df(t) = \sum 2\pi i n\, \hat f(n)\, e_n(t)$ is a continuous function: the function $f$ is thus of class $C^1$.
More generally, if the Fourier coefficients of a function $f$ satisfy
$\sum |n^p \hat f(n)| < +\infty$, the function is of class $C^p$: iterate the argument.

The case where $p = \infty$, i.e. of the functions $f \in \mathcal{D}(\mathbb{T})$, is simpler. If $f$ is
$C^\infty$, in which case (5) applies for any $p$ to the Fourier coefficients of all the
successive derivatives of $f$, one may differentiate the Fourier series of $f$ term-by-term
ad libitum and obtain series which are all normally convergent and
represent the successive derivatives of $f$; note that, except for that of $f$, they
have no constant term. If, conversely, one takes coefficients $c(n)$ satisfying
(5) for any $p$ and if one puts $f(t) = \sum c(n)\, e_n(t)$, an absolutely convergent
series and so the Fourier series of $f$, it is clear that the products $n^r c(n)$ again
satisfy (5) for any $r$ and that the series obtained on differentiating the series
$\sum c(n)\, e_n(t)$ formally $r$ times will converge normally; the standard theorem
on term-by-term differentiation (Chap. III, n° 17, Theorem 19) then applies
to the series $\sum c(n)\, e_n(t)$: $f$ is a $C^\infty$ function of which $c(n)$ are the Fourier
coefficients. In conclusion:


Theorem 5. Let $c(n)$ be a scalar function on $\mathbb{Z}$. For there to be a function
$f \in C^\infty(\mathbb{T})$ such that $\hat f(n) = c(n)$ for every $n$ it is necessary and sufficient
that $c(n) = O(1/n^p)$ for every $p \in \mathbb{N}$. One may then differentiate the Fourier
series of $f$ term-by-term any number of times.


A function $c$ on $\mathbb{Z}$ satisfying (5) for every $p$ is said to be of rapid decrease.


10 — Distributions on T


The identification of the functions on T with the functions of period 1 on R
has allowed us to define the spaces $C^p(\mathbb{T})$ in an obvious way for every $p \in \mathbb{N}$,
and also $\mathcal{D}(\mathbb{T}) = C^\infty(\mathbb{T})$. As on the Schwartz space $\mathcal{D}(\mathbb{R})$ (Chap. V, n° 34)
one may define the norms


(10.1)    $\|\varphi\|_k = \|\varphi\| + \|D\varphi\| + \dots + \|D^k \varphi\|$


on $\mathcal{D}(\mathbb{T})$, and the distances


(10.1’)    $d_k(\varphi, \psi) = \|\varphi - \psi\|_k,$


where $\|\varphi\| = \sup |\varphi(u)|$ denotes the norm of uniform convergence on T (or,
in terms of periodic functions, on R) and where the $D^k \varphi = \varphi^{(k)}$ are the
successive derivatives, again periodic, of the function $\varphi$. A concept of convergence
is associated with these norms: a sequence $\varphi_n \in \mathcal{D}(\mathbb{T})$ converges to a
$\varphi \in \mathcal{D}(\mathbb{T})$ if $\lim d_k(\varphi, \varphi_n) = 0$ for every $k$, in other words, if for every $k \ge 0$
one has $\lim D^k \varphi_n = D^k \varphi$ uniformly on T. This is the mode of convergence
which allows us to differentiate the given sequence term-by-term ad libitum,
to calculate the derivatives of the limit.

This said, a distribution on T is, as on R, a linear map $T : D(T) \to C$
which is continuous in the following sense: there exist a $k \in N$ and a constant
$M \ge 0$ such that


(10.2) $|T(\varphi)| \le M\|\varphi\|_k$ for every $\varphi \in D(T)$,


i.e. $|T(\varphi) - T(\psi)| \le M\|\varphi - \psi\|_k$. The smallest possible integer k is called
the order of T. Then $\lim T(\varphi_n) = T(\varphi)$ if the $\varphi_n \in D(T)$ converge uniformly
to a $\varphi \in D(T)$ as do all their successive derivatives of order $\le k$: the others
are not involved.

The examples given in Chap. V, n° 34 in the case of R transpose easily to
here, so long as one does not try to integrate the periodic functions on all of R,
an integral of this kind clearly being divergent. In particular, every integrable
function f on T defines a distribution $T_f : \varphi \mapsto \int \varphi(u) f(u)\,dm(u)$, and every
measure $\mu$ on T a distribution $T_\mu : \varphi \mapsto \int \varphi(u)\,d\mu(u)$. These distributions are
of order 0. A distribution such as $\varphi \mapsto \int \varphi^{(r)}(u) f(u)\,dm(u)$ is of order r; we
shall see later that up to an additive constant¹⁸, every distribution on T is
of this type.

It would be convenient to use the Leibniz notation $T(\varphi) = \int \varphi(u)\,dT(u)$
for distributions; the definition of the derivative


(10.3) $T'(\varphi) = -T(\varphi')$


18 A constant c is also the constant function $u \mapsto c$, so is also a distribution, namely
$\varphi \mapsto c\int \varphi(u)\,dm(u) = c\,m(\varphi)$.

of a distribution would then be written

(10.4) $\int \varphi(u)\,dT'(u) = -\int \varphi'(u)\,dT(u)$


as in the formula for integration by parts (9.2) from which it is directly 
derived. Frowned on by Schwartz, this notation has not gained currency; but 
we shall use it on occasion. 

As in the case of functions and of measures on T, one may associate
Fourier coefficients


(10.5) $\hat T(n) = \int u^{-n}\,dT(u) = T(e_{-n})$


to every distribution T on the torus. Now the fact that the Fourier series of
a function $\varphi \in D(T)$ converges uniformly together with all its derived series
clearly means that the series $\varphi = \sum \hat\varphi(n) e_n$ converges in the sense of the
space D(T): we have


(10.6) $\lim_{N \to \infty} \|\varphi - \varphi_N\|_k = 0$ for every $k \in N$


where, as always, the $\varphi_N$ are the partial sums of the Fourier series of $\varphi$.

One may thus “integrate” the Fourier series of $\varphi$ term-by-term with
respect to any distribution T on the torus. Since the value of T on the function
$e_n$ is just, by definition, the Fourier coefficient $\hat T(-n)$ of T, one finds


(10.7) $T(\varphi) = \sum \hat T(-n)\hat\varphi(n)$ for every $\varphi \in D(T)$.


This relation resembles Parseval-Bessel more if one writes it in the form

$T(\bar\varphi) = \int \overline{\varphi(u)}\,dT(u) = \sum \overline{\hat\varphi(n)}\,\hat T(n)$.


One may again interpret it as an expansion of T in Fourier series. Let us
associate a distribution $T_f$ to every reasonable function f on T by putting
$T_f(\varphi) = \int \varphi(u) f(u)\,dm(u)$. In particular, write $E_n$ for the distribution
associated to the function $t \mapsto e_n(t)$ or $u \mapsto u^n$, whence


$E_n(\varphi) = \hat\varphi(-n)$ for every $\varphi \in D(T)$.


Formula (7) can then be written


(10.8) $T(\varphi) = \sum \hat T(n) E_n(\varphi)$ for every $\varphi \in D(T)$


or, symbolically, in the form


$T = \sum \hat T(n) E_n$;


this manner of writing has a sense if one defines the sum of a series $\sum T_n$ of
distributions as the distribution T such that


$T(\varphi) = \sum T_n(\varphi)$


for every $\varphi \in D(T)$, which assumes, at the least (and, in fact, precisely¹⁹),
that the right hand side converges for every $\varphi \in D(T)$.

For example let us choose $T = T_f$ where f is a regulated function on T,
whence $\hat T(n) = \hat f(n)$. For every $\varphi \in D(T)$, by Parseval-Bessel,


$T_f(\varphi) = \int \varphi(u) f(u)\,dm(u) = \sum \hat f(n)\hat\varphi(-n) = \lim_{N\to\infty} \sum_{|n| \le N} \hat f(n)\hat\varphi(-n) = \lim_{N\to\infty} \int \varphi(u) f_N(u)\,dm(u)$,


where the $f_N$ are the partial sums of the Fourier series of f. From the
distribution point of view this may be written


(10.9) $T_f(\varphi) = \lim T_{f_N}(\varphi)$, i.e. $T_f = \lim T_{f_N}$;


in other words, qua distribution, the function f is the limit of the partial
sums of its Fourier series. This does not mean that the latter converges to
f in the usual sense! This is one of the sleights of hand allowed by the
theory of distributions ...

On the other hand we note that the derivative $T' = DT$ of a distribution
T has for Fourier coefficients the numbers


(10.10) $\widehat{DT}(n) = DT(e_{-n}) = -T(e'_{-n}) = -T(-2\pi i n\,e_{-n}) = 2\pi i n\,\hat T(n)$;


in other terms, another trick, the formula (9.3) is valid for every distribution
on T.


Can one characterise the functions $n \mapsto c(n)$ on Z which are the Fourier
coefficients of a distribution? If T is a distribution, by definition one has an
inequality of the form


(10.11) $|T(\varphi)| \le M\|\varphi\|_k$,

valid for every $\varphi \in D(T)$. But if $\varphi(t) = e_n(t)$, we have, up to powers of the
factor $2\pi i$, that $D\varphi(t) = n e_n(t)$, $D^2\varphi(t) = n^2 e_n(t)$, etc. and so

$\|e_n\|_k = 1 + |2\pi n| + |2\pi n|^2 + \ldots + |2\pi n|^k$,


19 If a series $\sum T_n(\varphi)$ converges for any $\varphi \in D(T)$, then $T(\varphi) = \sum T_n(\varphi)$ is again a
distribution, i.e. satisfies an estimate of the form $|T(\varphi)| \le M\|\varphi\|_k$. The proof
is obtained without any calculation from the general theorems of functional
analysis.




an expression $\sim |2\pi n|^k$ for |n| large (order of growth of a polynomial at
infinity). One concludes from (2) that there exists an integer k such that


(10.12) $\hat T(n) = O(|n|^k)$ for |n| large.


Conversely, every function c(n) satisfying $c(n) = O(|n|^k)$ for one integer $k \in N$
defines a distribution by the formula


(10.13) $T(\varphi) = \sum c(-n)\hat\varphi(n)$.


First, the series converges since the product of a function “of slow increase”
by a function of rapid decrease is clearly of rapid decrease. One has
$T(e_{-n}) = c(n)$ since the Fourier coefficients of $e_{-n}$ are all zero apart from
that of index $-n$ (orthogonality relations). It remains to establish the
continuity, in the sense of D(T), of the linear form $\varphi \mapsto T(\varphi)$.

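First,


(10.14) $\widehat{D^r\varphi}(n) = (2\pi i n)^r \hat\varphi(n)$

for any r and for every $\varphi \in D(T)$, and so


(10.15) $\sum |(2\pi i n)^r \hat\varphi(n)|^2 = \int |D^r\varphi(u)|^2\,dm(u) \le \|D^r\varphi\|^2$


since the mean value of a function is bounded by its uniform norm. Now we
write (13) in the form


(10.16) $T(\varphi) = c(0)\hat\varphi(0) + \sum_{n \ne 0} \frac{c(-n)}{(2\pi i n)^r}\,(2\pi i n)^r \hat\varphi(n)$

with r = k + 1 and put

$u_n = c(-n)/(2\pi i n)^r$, $v_n = (2\pi i n)^r \hat\varphi(n)$.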


By (12), we have $u_n = O(1/n)$ and therefore $\sum |u_n|^2 < +\infty$. The relation
(14) and Parseval-Bessel show that also $\sum |v_n|^2 = \|D^r\varphi\|_2^2 < +\infty$. The
Cauchy-Schwarz inequality then shows that


$\left|\sum u_n v_n\right| \le M\|D^r\varphi\|_2 \le M\|D^r\varphi\|$


where $M^2 = \sum |u_n|^2$ depends only on T. Since r = k + 1, we finally have a
bound


(10.17) $|T(\varphi)| \le |c(0)| \cdot |\hat\varphi(0)| + M\|D^{k+1}\varphi\|$,


which shows that T truly is a distribution. In conclusion:

Theorem 6. Let $n \mapsto c(n)$ be a scalar function on Z. For there to be a
distribution T on T such that $\hat T(n) = c(n)$ for every n, it is necessary and
sufficient that there exists a $k \in N$ such that $c(n) = O(|n|^k)$.


One says then that the function c(n) is of slow increase or is tempered. 
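
As a numerical illustration of Theorem 6 (a sketch only, not taken from the text), the following Python fragment pairs a tempered sequence c(n) with a test function through formula (13). The test function exp(sin 2πt), the cutoff and the number of sample points are assumptions chosen for the illustration, and the Fourier coefficients are approximated by equispaced Riemann sums. With c(n) = 1 one expects to recover φ(0) (the Dirac measure), and with c(n) = 2πin one expects −φ'(0) (its derivative).

```python
import cmath
import math

def fourier_coefficient(phi, n, samples=256):
    """Approximate the n-th Fourier coefficient of a 1-periodic function phi
    by an equispaced Riemann sum (very accurate when phi is smooth)."""
    total = 0j
    for j in range(samples):
        t = j / samples
        total += phi(t) * cmath.exp(-2j * math.pi * n * t)
    return total / samples

def pair(c, phi, cutoff=40):
    """Truncation to |n| <= cutoff of T(phi) = sum of c(-n) * phi_hat(n)."""
    return sum(c(-n) * fourier_coefficient(phi, n)
               for n in range(-cutoff, cutoff + 1))

# Hypothetical test function: smooth, of period 1, phi(0) = 1, phi'(0) = 2*pi.
phi = lambda t: math.exp(math.sin(2 * math.pi * t))

dirac = lambda n: 1.0                       # coefficients of the Dirac measure
dirac_prime = lambda n: 2j * math.pi * n    # coefficients of its derivative

value_at_origin = pair(dirac, phi)               # close to phi(0)
minus_derivative = pair(dirac_prime, phi)        # close to -phi'(0)
```

Both sequences are of slow increase (k = 0 and k = 1 respectively), and the sums converge quickly only because the coefficients of the test function are of rapid decrease.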

Example. Consider with Fourier the series

$\sin t - \sin(2t)/2 + \sin(3t)/3 - \ldots$;


Fourier calculates its partial sums by differentiating, as was done in Chap. V,
n° 16 for square waves, which, for $|t| < \pi$, puts them in the form


$\frac{t}{2} - \frac{(-1)^N}{2}\int_0^t \frac{\cos(N + \frac12)x}{\cos(x/2)}\,dx$;


an integration by parts shows that the integral tends to 0, whence

$t/2 = \sin t - \sin(2t)/2 + \sin(3t)/3 - \ldots$ for $|t| < \pi$.


When Fourier presented his first manuscript to the Académie, Lagrange had
objections; for example, he wrote the preceding formula in the form


$\frac12(\pi - t) = \sin t + \sin(2t)/2 + \sin(3t)/3 + \ldots$,


and differentiated to obtain


(*) $-\frac12 = \cos t + \cos 2t + \cos 3t + \ldots$,


then integrated the result between 0 and t, whence

$-t/2 = \sin t + \sin(2t)/2 + \sin(3t)/3 + \ldots$


and a superb contradiction! Fourier replied that the formula from which
Lagrange started is valid only for $0 < t < 2\pi$ and that he consequently had
no right to integrate the derived series²⁰ from t = 0.

He might have started by observing that it is not very catholic to
differentiate the initial series term-by-term since the series $\sum \cos nt$ is clearly
divergent for any t; but since he himself did so constantly, Fourier did not
use this argument ...

In fact, formula (*) makes sense (but is wrong) in the sense of
distributions. Using Euler’s relations it can be written as


$\sum_{n \in Z} e_n = 0$,


20 See Grattan-Guinness, Joseph Fourier 1768-1830, p. 172. 
and if we interpret the left hand side as a series of distributions, (*) means
that

$\sum_{Z} \hat\varphi(n) = 0$ for every $\varphi \in D(T)$.

But the result should be $\varphi(0)$ or $\varphi(1)$ depending on whether you are in R
or T. Now $\varphi(0) = \delta(\varphi)$ where $\delta$ is the Dirac measure at the origin 1 on T.
The correct formula is therefore


$\sum_{n \in Z} e_n = \delta$,


an identity between distributions equivalent to the obvious formula $\hat\delta(n) = 1$.
One should therefore replace (*) by


$\sum_{n \ge 1} \oint \varphi(t) \cos nt\,dt = \frac12\varphi(0) - \frac12 \oint \varphi(t)\,dt$,


a formula equivalent to


$\varphi(0) = \sum_{Z} \hat\varphi(n)$.


The presence of the additional term $\frac12\varphi(0)$ is easy to explain; the series (*)
was indeed obtained by differentiating a series whose sum, equal to $\frac12(\pi - t)$
for $0 < t < 2\pi$, is discontinuous at t = 0 (or, in version T, at u = 1);
the distribution obtained by differentiating it must therefore contain a Dirac
measure at the origin as in the case of the function equal to 1 for t > 0
and to 0 for t < 0 (Chap. V, n° 35, Example 2). Note in passing that if one
considered distributions on R and not on T, the derivative of the function
$\sum \sin(nt)/n$ would include a Dirac measure at each multiple of $2\pi$.


A method of stripping all the mystery from the distributions consists
of considering their successive primitives. A primitive S of a distribution T
must, by definition, satisfy the relation S' = T, i.e.


$S(D\varphi) = -T(\varphi)$

for every $\varphi \in D(T)$. Then, if S exists, by (10) one has $\hat T(0) = 0$ and

(10.18) $\hat S(n) = \hat T(n)/2\pi i n$


for $n \ne 0$. Since the sequence $\hat T(n)/n$ is slowly increasing, S will exist if and
only if


(10.19) $\hat T(0) = T(e_0) = \int dT(u) = 0$,




the “integral” of the constant function 1 with respect to T. In this case S is
unique up to an additive constant, namely the term $\hat S(0)$ of its Fourier series;
if one chooses this to be zero one obtains a standard primitive of T, which it
is natural to denote $D^{-1}T$ or $T^{(-1)}$; then


(10.20) $\widehat{D^{-1}T}(0) = 0$, $\widehat{D^{-1}T}(n) = \hat T(n)/2\pi i n$ for $n \ne 0$.


When $\hat T(0) \ne 0$, one may apply the argument to the terms of nonzero index
of the Fourier series of T, whence a distribution S such that $T = \hat T(0) + S'$,
i.e. such that


(10.21) $T(\varphi) = \hat T(0)\,m(\varphi) - S(D\varphi)$


for every $\varphi \in D(T)$; one can, here again, insist that $\hat S(0) = 0$ to standardise S.

The interest of this operation is that on applying it repeatedly to a
distribution T such that $\hat T(0) = 0$, i.e. “orthogonal” to the constant functions, one
increases the chances of convergence in the usual sense of the Fourier series of
T since one divides its coefficients by the powers of n. Since these coefficients
are of slow increase, it is clear that on choosing an integer r sufficiently large,
the Fourier coefficients of the primitive of order r of T form an absolutely
convergent series, in other words are those of a continuous function f. This
means that T is the derivative of order r of the function f in the sense of
distributions, or again that every distribution on T is given by a formula


(10.22) $T(\varphi) = (-1)^r \int \varphi^{(r)}(u) f(u)\,dm(u) + c \int \varphi(u)\,dm(u)$


where $c = \hat T(0)$ is a constant. Despite appearances, the notion of a
distribution on the torus is thus hardly more general than that of a function in the
usual sense: one integrates its derivatives.

We said²¹ in n° 7 that in the modern theory of integration, every function
$c \in L^2(Z)$ is the Fourier transform of a “square integrable” function on T.
Though unable to prove this now, we remark that, by Theorem 6, there exists
a distribution T such that $\hat T(n) = c(n)$; it is given by the formula (13). In
fact, the latter is meaningful for every regulated function f since then the
series $\sum |\hat f(n)|^2$ converges, hence also $\sum c(-n)\hat f(n)$; if one puts


(10.23) $T(f) = \sum c(-n)\hat f(n)$

again in this case, the Cauchy-Schwarz inequality for series shows that

$|T(f)|^2 = \left|\sum c(-n)\hat f(n)\right|^2 \le M^2\|f\|_2^2$


where $M^2 = \sum |c(n)|^2$. Hence a bound of the form


21 This paragraph is not important in the sequel. 
(10.24) $|T(f)| \le M\|f\|_2 \le M\|f\|$


for every regulated function on T, and in particular for every continuous
function, which shows that the distribution T is a measure on T. In fact, T is
defined by a measure of the form g(u)dm(u) where g is the square integrable
function (à la Lebesgue) on T such that $\hat g(n) = c(n)$ for every n, and (24)
is just the extension to these functions of the Cauchy-Schwarz inequality of
Chap. V, n° 2.


§ 3. Dirichlet’s method


11 — Dirichlet’s theorem 


When Dirichlet discovered Fourier’s work, at the beginning of the 1820s, he
tried to justify it by rigorous methods. Fourier having discovered the general
formula which we now write


(11.1) $\hat f(n) = \int f(u) u^{-n}\,dm(u)$


after dozens of pages of implausible calculations, and Dirichlet having heard
from Cauchy that the sum of a series is the limit of its partial sums, he started
by calculating those of a Fourier series (we shall simplify the calculation a
little by using convolution products):


(11.2) $f_N = \sum_{|n| \le N} f * e_n = f * \Big(\sum_{|n| \le N} e_n\Big) = f * D_N$

where

(11.3) $D_N(u) = \sum_{|n| \le N} u^n = u^{-N} + \ldots + u^N = u^{-N}\,\frac{1 - u^{2N+1}}{1 - u}$ for $u \ne 1$.

It follows that

(11.4) $f_N(u) = f * D_N(u) = \int f(uv^{-1}) D_N(v)\,dm(v)$.

On putting v = e(t) we have


(11.5) $D_N(v) = \frac{e((N+1)t) - e(-Nt)}{e(t) - 1} = \frac{e((N+\frac12)t) - e(-(N+\frac12)t)}{e(t/2) - e(-t/2)} = \frac{\sin(2N+1)\pi t}{\sin \pi t}$


as one sees on multiplying the two terms of the fraction by $e(-t/2) = e^{-\pi i t}$
and using Euler’s formulae. The calculation obviously assumes that $v \ne 1$,
i.e. $t \notin Z$; the value $D_N(1) = 2N + 1$ follows from definition (3). On passing
to the language of periodic functions, the partial sums $f_N$ are again given by


(11.6) $f_N(s) = \oint f(s - t) D_N(t)\,dt = \oint f(s - t)\,\frac{\sin(2N+1)\pi t}{\sin \pi t}\,dt$.
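
The closed form (11.5) is easy to check numerically. The following Python sketch compares the defining sum (11.3), taken at u = e(t), with the quotient of sines; the function names and the tolerance used to detect integer values of t are assumptions made for the illustration.

```python
import cmath
import math

def dirichlet_sum(N, t):
    """D_N as the sum of e(nt) over |n| <= N, with e(t) = exp(2*pi*i*t)."""
    total = sum(cmath.exp(2j * math.pi * n * t) for n in range(-N, N + 1))
    return total.real  # the imaginary parts of the terms n and -n cancel

def dirichlet_closed(N, t):
    """D_N as sin((2N+1)*pi*t) / sin(pi*t), with the value 2N+1 at integers."""
    s = math.sin(math.pi * t)
    if abs(s) < 1e-12:          # t is (numerically) an integer
        return 2.0 * N + 1.0
    return math.sin((2 * N + 1) * math.pi * t) / s

# Largest discrepancy between the two expressions at a few sample points.
gaps = [abs(dirichlet_sum(5, t) - dirichlet_closed(5, t))
        for t in (0.1, 0.37, -0.2, 0.5)]
```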


Since we are dealing with the convolution products (on T) of f by the
sequence of functions $D_N$, and since we would like the result to tend to f(s)
when N increases indefinitely, it would seem, at first sight, that we should
use the method of Dirac sequences expounded in n° 5. The condition


$\int D_N(u)\,dm(u) = 1$


is satisfied, because this mean value is the Fourier coefficient of index 0 of
the trigonometric polynomial $D_N$. But the $D_N$ change sign more and more
often as N increases; it is not obvious (nor even correct) that the integral
of $|D_N(u)|$ remains bounded as N increases. Finally, if one works on an arc
$|u - 1| \ge \delta$ of T, then $|D_N(u)| \le |1 - u^{2N+1}|/\delta$ by (3), which is insufficient
to make $D_N(u)$ tend to 0. In short, a bad idea.
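
The failure of boundedness can be observed numerically. The sketch below estimates the integral of $|D_N|$ over one period by the midpoint rule; the number of sample points is an assumption, and the values obtained are approximations rather than exact constants. The integrals increase slowly with N instead of staying bounded.

```python
import math

def dirichlet(N, t):
    """Dirichlet kernel sin((2N+1)*pi*t) / sin(pi*t), equal to 2N+1 at integers."""
    s = math.sin(math.pi * t)
    if abs(s) < 1e-12:
        return 2.0 * N + 1.0
    return math.sin((2 * N + 1) * math.pi * t) / s

def l1_norm(N, samples=20000):
    """Midpoint-rule estimate of the integral of |D_N(t)| over [0, 1]."""
    h = 1.0 / samples
    return h * sum(abs(dirichlet(N, (j + 0.5) * h)) for j in range(samples))

# The integral of D_N itself is always 1, but that of |D_N| keeps growing.
norms = {N: l1_norm(N) for N in (1, 10, 100)}
```

For N = 1 the integral can be computed by hand and equals 1/3 + 2√3/π, which gives a check on the quadrature.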

Moreover, if the $D_N$ did form a Dirac sequence, the Fourier series of every
continuous function would converge uniformly to the latter by the lemma of
n° 5: this would be Paradise. On Earth, although converging “almost
everywhere” in the sense of Lebesgue measure²² (a famous and very difficult result
of Lennart Carleson, 1966, valid for “square-integrable” functions in Lebesgue’s
sense), it can still very well diverge for values of u forming an uncountable
set²³. In other words, the method does not work because if it did it would
lead to a false result.

Having lived and died (1805-1859) too early to have heard of Lebesgue,
Dirac, Carleson and even of Weierstrass’ approximation theorem, Dirichlet
did not ask himself these questions and, using (4) (so in reality (5)),
calculated the difference


(11.7) $f_N(u) - f(u) = \int [f(uv^{-1}) - f(u)]\,D_N(v)\,dm(v)$


or, replacing v by $v^{-1}$ since $D_N$ is symmetric,


(11.8) $f_N(u) - f(u) = \int \frac{f(uv) - f(u)}{v - 1}\,(v^{N+1} - v^{-N})\,dm(v)$.


22 In Chap. V, n° 11 we defined the (Lebesgue) measure of an open U contained in
a compact interval; n° 31, where we defined the integral of a positive lsc function
on R, likewise allowed us to define the measure of any open $U \subset R$. This being
so, a subset N of R is said to be of measure zero if for every r > 0 there exists an
open U such that $N \subset U$, m(U) < r. Granted this, a property (the convergence
of a series of functions for example) is said to be true almost everywhere if the
set of x where it is false is of measure zero. Every countable set is of measure
zero, but not conversely. See the Appendix to Chap. V.

23 The first example was that of the German P. du Bois-Reymond: “Before 1873, it 
was the general belief, of Lejeune Dirichlet, of Riemann, of Weierstrass, among 
others, that this series always converges to the limit f(x) when f(x) is contin- 
uous. Now, in trying to find a proof of this theorem, I came upon an argument 
to prove the contrary”. Letter of 1883 to the Frenchman G. Halphen (Dugac, 
p. 62). In 1926 the Soviet mathematician A. N. Kolmogoroff produced an in- 
tegrable (but not square integrable) function in the sense of Lebesgue whose 
Fourier series diverges everywhere. Newton would probably have said that one 
does not meet such functions in Nature. 
The right hand side of (8) resembles the difference between the Fourier
coefficients of indices −N − 1 and N of the function


(11.9) $g_u(v) = [f(uv) - f(u)]/(v - 1)$;


but this function, as regulated as f for $v \ne 1$, has, a priori, no meaning
for v = 1; its integral may well diverge on a neighbourhood of this point,
which prevents one from speaking of its Fourier coefficients; the integral (8)
is defined only because it involves the quotient $(v^{N+1} - v^{-N})/(v - 1)$, an
everywhere continuous trigonometric polynomial.

Since $v - 1 = e(t) - 1 \sim 2\pi i t$ when t tends to 0, i.e. when v tends to 1,
one always has


(11.10) $\lim_{v \to 1} [f(uv) - f(u)]/(v - 1) = f'(s)/2\pi i$

if this derivative exists at the point u = e(s) considered. The function $g_u$
then has left and right limit values at every point $v \in T$, so is regulated on
all of T. In this case it is legitimate to write that


(11.11) $f_N(u) - f(u) = \hat g_u(-N - 1) - \hat g_u(N)$;


and to show that the left hand side tends to 0, it is enough to know that the
Fourier coefficients of a regulated function tend to 0 at infinity, which the
Parseval-Bessel inequality (7.12) makes obvious without recourse to
Weierstrass’ theorem. Thus:


Theorem 7. Let f be a regulated periodic function. Then


(11.12) $f(u) = \lim_{N \to \infty} \sum_{|n| \le N} \hat f(n) u^n = \lim f_N(u)$


at every point $u \in T$ where f is differentiable.


Corollary (Riemann). The behaviour on an open interval of the Fourier 
series of a regulated periodic function f depends only on the behaviour of f 
on this interval. 


If in fact f = g on an open interval U then the function f — g has a 
derivative at every point of U. Its Fourier series therefore converges to 0 at 
every t € U. This means that, for every t € U, only two cases are possible: 
(i) the Fourier series of f and g are simultaneously divergent at t, (ii) they 
are simultaneously convergent and have the same sum. Another translation: 
if two regulated periodic functions f and g are equal on an interval with 
centre t, then their Fourier series at t are either simultaneously divergent, or 
simultaneously convergent with the same sum on a neighbourhood of t. 

Dirichlet in fact went somewhat further than Theorem 7, for the sum of
the square wave series, to mention just this one, is not differentiable in the
strict sense at the points where it is discontinuous; there it has only left and
right derivatives; so one has to modify the preceding calculations. Now the
symmetry of the function $D_N$ shows that its integral over $[-\frac12, 0]$ or over
$[0, \frac12]$ is equal to $\frac12$; this allows us to replace (7), or the R version, by


(11.13) $f_N(s) - \frac12[f(s+) + f(s-)] = \int_0^{1/2} [f(s+t) - f(s+)]\,D_N(t)\,dt + \int_0^{1/2} [f(s-t) - f(s-)]\,D_N(t)\,dt$.


The quotient


$[f(s+t) - f(s+)]/\sin \pi t$


appears in the first integral. If f has a right derivative at the point s (obvious
definition), this quotient tends to a limit when t > 0 tends to 0; for $0 < t \le \frac12$,
this quotient then has, like f, left and right limit values; the first integral is
thus, as in (11), the value at N of the Fourier transform of a regulated
periodic function that vanishes on $]\frac12, 1[$, so tends to 0 as N increases. Same
argument for the second integral. Whence a simple result, which has been
refined in many ways (see for example A. Zygmund, Trigonometrical Series,
Cambridge UP, 1969):


Theorem 7 bis (Dirichlet, 1829). Let f be a regulated periodic function
and $f_N$ the partial sum of order N of its Fourier series. Then


(11.14) $\lim f_N(s) = \frac12[f(s+) + f(s-)]$

at every point where f has left and right derivatives.


Exercise. Dirichlet’s Theorem is still valid if the function $t \mapsto |f(s+t) - f(s)|/|t|$
is integrable. Example: $f(s+t) = f(s) + O(|t|^\alpha)$ when $t \to 0$, with an
$\alpha > 0$, in which case the graph of f at s may have a vertical tangent.


Example 1. Expansion of cot z as a series of rational fractions. Consider the
function of period 1 on R given by


(11.15) $f(t) = \cos 2\pi z t$ for $|t| \le \frac12$


where $z \in C$ is not a rational integer, for otherwise there would be no problem.
Since $f(-\frac12) = f(\frac12)$, the periodic function which extends f to all of R
is continuous everywhere and it is clear that it satisfies the hypotheses of
Theorem 7 bis. We have


$\hat f(n) = \int_{-1/2}^{1/2} \cos(2\pi z t)\,e^{-2\pi i n t}\,dt = \frac12 \int_{-1/2}^{1/2} \left[e^{2\pi i(z-n)t} + e^{-2\pi i(z+n)t}\right] dt = \frac12 \left[\frac{e^{2\pi i(z-n)t}}{2\pi i(z-n)} - \frac{e^{-2\pi i(z+n)t}}{2\pi i(z+n)}\right]_{-1/2}^{1/2} = (-1)^n\,\frac{z \sin \pi z}{\pi(z^2 - n^2)}$


as we see using Euler’s formulae. Thus


(11.16) $\pi \cos 2\pi z t = \sum_{n \in Z} (-1)^n\,\frac{z \sin \pi z}{z^2 - n^2}\,e^{2\pi i n t}$ for $|t| \le \frac12$,


an absolutely convergent Fourier series. In particular, for $t = \frac12$,


(11.17) $\pi \cot \pi z = z \sum_{Z} \frac{1}{z^2 - n^2} = \frac1z + 2z \sum_{n \ge 1} \frac{1}{z^2 - n^2}$.


This is the formula due to Euler which we have already met several times, 
and established at Chap. IV, n° 18, using the infinite product for the sine 
function. The method we have just presented — to be found essentially in 
Fourier — is surely the simplest proof. 
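
A quick numerical check of (11.17), offered only as a sketch: the series is truncated at an assumed number of terms, and since its tail is of order 2|z|/N the agreement is limited to a few decimal places. The sample values of z are arbitrary non-integer choices.

```python
import cmath

def pi_cot(z):
    """Left hand side of (11.17): pi * cot(pi * z)."""
    return cmath.pi * cmath.cos(cmath.pi * z) / cmath.sin(cmath.pi * z)

def euler_series(z, terms=200000):
    """Right hand side of (11.17), truncated after the given number of terms."""
    tail = sum(1 / (z * z - n * n) for n in range(1, terms + 1))
    return 1 / z + 2 * z * tail

# Discrepancy for a real and a complex argument (z must not be an integer).
errors = [abs(pi_cot(z) - euler_series(z)) for z in (0.3, 2.5, 0.5 + 1j)]
```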

For t = 0, (16) yields the expansion


(11.18) $\frac{\pi}{\sin \pi z} = \frac1z + 2z \sum_{n \ge 1} \frac{(-1)^n}{z^2 - n^2}$.


Example 2. The Bernoulli polynomials. Recall (Chap. VI, n° 12) that the
Bernoulli polynomials are defined by the recurrence relations


(11.19) $B_0(x) = 1$, $B_k'(x) = k B_{k-1}(x)$

and by the condition

(11.20) $B_k(0) = B_k(1)$ for $k \ge 2$.


The inventor was not acquainted with Fourier series, but condition (20) is
exactly what one needs to transform the $B_k$, for $k \ge 2$, into continuous
periodic functions $B_k^*$, by putting


(11.21) $B_k^*(t) = B_k(t)$ for $0 \le t < 1$


as we did in Chap. VI à propos the Euler-Maclaurin formula. The hypotheses
of the Dirichlet theorems are clearly satisfied. Adopting for once the notation
$a_n(f) = \hat f(n)$, we have, integrating by parts and assuming $k \ge 2$, $n \ne 0$,


$a_n(B_k^*) = \int_0^1 B_k(t) e_{-n}(t)\,dt = \frac{1}{2\pi i n} \int_0^1 B_k'(t) e_{-n}(t)\,dt$;

(19) now shows that

(11.22) $a_n(B_k^*) = k\,a_n(B_{k-1}^*)/2\pi i n$ $(k \ge 2,\ n \ne 0)$.


On writing this relation for k − 1, k − 2, ..., 2 we obtain


(11.23) $a_n(B_k^*) = k!\,a_n(B_1^*)/(2\pi i n)^{k-1}$.
Since $B_1(t) = t - \frac12$ and since the Fourier coefficients of a constant vanish for
$n \ne 0$, we have


$a_n(B_1^*) = \int_0^1 t\,e_{-n}(t)\,dt = -\frac{1}{2\pi i n} + \frac{1}{2\pi i n} \int_0^1 e_{-n}(t)\,dt$;


the last integral is zero and what remains is

(11.24) $a_n(B_1^*) = -1/2\pi i n$,

whence finally

(11.25) $a_n(B_k^*) = -k!/(2\pi i n)^k$ for $k \ge 1$, $n \ne 0$.


For n = 0, we have $(k+1)\,a_0(B_k^*) = \oint B_{k+1}'(t)\,dt = 0$ by (20) if $k \ge 1$, and
$a_0(B_0^*) = 1$ trivially.

Formula (25) shows that the Fourier series is absolutely convergent for
$k \ge 2$, whence


(11.26) $\sum_{n \ne 0} e_n(t)/(2\pi i n)^k = -B_k(t)/k!$ for $k \ge 2$, $0 \le t \le 1$,


the sum being taken over all nonzero $n \in Z$. For k = 2 for example, one finds


$\sum_{n \ge 1} \cos(2\pi n t)/\pi^2 n^2 = t^2 - t + 1/6$ $(0 \le t \le 1)$.


For t = 0, the left hand side of (26) reduces to $\sum 1/(2\pi i n)^k$, so is zero for
odd k; for k = 2p, $p \ge 1$, on the other hand,


(11.27) $\sum_{n \ne 0} 1/n^{2p} = (-1)^{p+1} (2\pi)^{2p}\,b_{2p}/(2p)!$


where $b_k = B_k(0)$ (Chap. VI, (3.7)). We should not forget that the left hand
side is twice the sum of the Riemann series.
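
Formula (11.27) can be tested numerically. In the sketch below the Bernoulli numbers are generated by the classical recurrence on binomial coefficients, which is an assumption of the illustration (it is not the definition used in the text, although it yields the same numbers $b_k = B_k(0)$ with $b_1 = -\frac12$); the Riemann series is simply truncated.

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli_numbers(m):
    """Exact b_0, ..., b_m from sum_{j<=k} C(k+1, j) * b_j = 0 for k >= 1."""
    b = [Fraction(1)]
    for k in range(1, m + 1):
        b.append(-sum(comb(k + 1, j) * b[j] for j in range(k)) / (k + 1))
    return b

def twice_riemann_sum(p, terms=200000):
    """Truncation of the sum of 1/n^(2p) over all nonzero integers n."""
    return 2 * sum(1 / n ** (2 * p) for n in range(1, terms + 1))

def right_hand_side(p, b):
    """(-1)^(p+1) * (2*pi)^(2p) * b_{2p} / (2p)! as in (11.27)."""
    return (-1) ** (p + 1) * (2 * pi) ** (2 * p) * float(b[2 * p]) / factorial(2 * p)

b = bernoulli_numbers(6)
gaps = [abs(twice_riemann_sum(p) - right_hand_side(p, b)) for p in (1, 2, 3)]
```

For p = 1 the right hand side is π²/3, i.e. twice Euler's value π²/6.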

For k = 1, the function $B_1^*$, equal to $t - \frac12$ for 0 < t < 1, is discontinuous
at the points $t \in Z$. On grouping the terms of index n and −n of its Fourier
series, we again have


(11.28) $\frac12 - t = \sum_{n \ge 1} \sin(2\pi n t)/\pi n$ for 0 < t < 1,


the series being zero for t = 0 or 1 as one may check without invoking
Dirichlet. For $t = \frac14$ one obtains Leibniz’ series for $\pi/4$.

12 — Fejér’s theorem


We observed in the preceding n° that the Dirichlet kernels do not form a
Dirac sequence in the sense of n° 5. At the end of the XIXth century, the
Italian Cesàro had the idea of making divergent sequences $(u_n)$ converge by
considering their arithmetic means


(12.1) $v_n = (u_1 + \ldots + u_n)/n$.


If you apply this to the sequence 1, 0, 1, 0, ..., you will find that it then
“converges” to $\frac12$. The method does not always work, even if one iterates
(every sequence which tends to $+\infty$ is recalcitrant), but it is reassuring at
least to know that if the sequence converges to u in the usual sense, then it
also converges to u in the Cesàro sense: if $|u - u_n| < r$ for n > p and if one
writes that


$v_n = \frac{u_1 + \ldots + u_p}{n} + \frac{u_{p+1} + \ldots + u_n}{n}$,


the first quotient is, for p given, < r for n large; on replacing each $u_k$ by u
in the second, one commits an error bounded by (n − p)r/n < r, whence a
total error < 2r for n large, qed.
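
The two facts just stated can be observed on examples. The following Python sketch computes the arithmetic means (12.1) of a finite list; the helper name and the two sample sequences are assumptions made for the illustration.

```python
def cesaro_means(seq):
    """Return the list of arithmetic means (u_1 + ... + u_n) / n, n = 1, 2, ..."""
    means, running = [], 0.0
    for n, u in enumerate(seq, start=1):
        running += u
        means.append(running / n)
    return means

# The divergent sequence 1, 0, 1, 0, ... has means tending to 1/2.
alternating = cesaro_means([1, 0] * 500)

# A sequence converging to 1 in the usual sense keeps the same limit.
convergent = cesaro_means([1 + 1 / n for n in range(1, 1001)])
```

The second list approaches 1 only slowly, in line with the proof above: the first p terms are damped by the factor 1/n and nothing better.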

One may also apply this procedure to a series $\sum u_n$, replacing the
standard partial sums $s_n = u_1 + \ldots + u_n$ by their means


(12.2) $\sigma_n = (s_1 + \ldots + s_n)/n$.


This allows one to make convergent series which are not; one finds again, for
example, the formula


$1 - 1 + 1 - 1 + \ldots = \frac12$


conforming to the somewhat premature anticipations of Jakob Bernoulli
(Chap. II, n° 7). The subject has been the object of much research, but
it is rarely used outside of “fine” analysis.

If one goes back to the Dirichlet formula


$f_N(t) = \oint f(t - x) D_N(x)\,dx = f * D_N(t)$


for the partial sums of the Fourier series of a function f it is clear that their
arithmetic means are the functions $f * F_N$ where the function


(12.3) $F_N = (D_0 + \ldots + D_{N-1})/N$


was introduced by L. Fejér (1880-1959).

In contrast to the $D_N$, the Fejér functions form a Dirac sequence on the
unit circle T. To see this, one has to calculate them. Putting $q = e^{\pi i t}$, one
has, by (11.5),

$D_n(t) = (q^{2n+1} - q^{-2n-1})/(q - q^{-1})$,

whence, adding from 0 to N − 1,

$N(q - q^{-1}) F_N(t) = (q + q^3 + \ldots + q^{2N-1}) - (q^{-1} + q^{-3} + \ldots + q^{-2N+1})$
$= q(q^{2N} - 1)/(q^2 - 1) - q^{-1}(q^{-2N} - 1)/(q^{-2} - 1)$
$= (q^N - q^{-N})^2/(q - q^{-1})$


and finally


(12.4) $F_N(t) = \frac{(q^N - q^{-N})^2}{N(q - q^{-1})^2} = \frac{\sin^2 N\pi t}{N \sin^2 \pi t}$

for $t \ne 0$, with $F_N(0) = N$ by continuity or by (3).

To show that the $F_N$ form a Dirac sequence on T it then suffices to show
that the $F_N$ are positive (obvious), that their integrals on T are equal to 1
(obvious, since this is so for the $D_n$, hence for their arithmetic means) and
finally that, for any r > 0 and $\delta > 0$, the contribution of the arc $|u - 1| \ge \delta$
of T to the integral of $F_N$ is < r for N large or, equivalently, that


(12.5) $\int_{\delta \le |t| \le 1/2} F_N(t)\,dt < r$ for N large.


But on this domain of integration, by (4) one has

(12.6) $F_N(t) \le 1/N \sin^2 \pi\delta$,

so that the $F_N$ converge uniformly to 0 on $\delta \le |t| \le \frac12$ for any $\delta > 0$, qed.
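
The closed form (12.4), the positivity of the $F_N$ and the bound (12.6) lend themselves to a direct numerical check. The sketch below is only an illustration; the function names, the grid and the values N = 50, δ = 0.1 are assumptions.

```python
import math

def dirichlet(n, t):
    """Dirichlet kernel sin((2n+1)*pi*t) / sin(pi*t), equal to 2n+1 at integers."""
    s = math.sin(math.pi * t)
    if abs(s) < 1e-12:
        return 2.0 * n + 1.0
    return math.sin((2 * n + 1) * math.pi * t) / s

def fejer_mean(N, t):
    """F_N as the arithmetic mean of D_0, ..., D_{N-1}, as in (12.3)."""
    return sum(dirichlet(n, t) for n in range(N)) / N

def fejer_closed(N, t):
    """F_N as sin^2(N*pi*t) / (N * sin^2(pi*t)), equal to N at integers."""
    s = math.sin(math.pi * t)
    if abs(s) < 1e-12:
        return float(N)
    return math.sin(N * math.pi * t) ** 2 / (N * s * s)

N, delta = 50, 0.1
grid = [j / 1000 for j in range(-500, 501)]
gap = max(abs(fejer_mean(N, t) - fejer_closed(N, t)) for t in grid)
smallest = min(fejer_closed(N, t) for t in grid)
far_from_origin = max(fejer_closed(N, t) for t in grid if abs(t) >= delta)
bound = 1.0 / (N * math.sin(math.pi * delta) ** 2)
```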


Theorem 8 (Fejér). For every regulated periodic function f the arithmetic
means of the partial sums of the Fourier series of f converge to
$\frac12[f(t+) + f(t-)]$ for any t. If f is continuous in an open interval J, the
convergence to f(t) is uniform on every compact $K \subset J$.


The second assertion follows from the lemma of n° 5.
To establish the first one writes, as in (11.13),


(12.7) $f * F_N(t) - \frac12[f(t+) + f(t-)] = \int [f(t+s) - f(t+)]\,F_N(s)\,ds + \int [f(t-s) - f(t-)]\,F_N(s)\,ds$,


the integrals being taken over $(0, \frac12)$, and then argues as in n° 5.


One may note in passing that assuming f continuous everywhere one 
obtains a proof of Weierstrass’ approximation theorem (without having used 
it beforehand ...). 
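
To see Theorem 8 at work, one may compare the partial sums and their arithmetic means for the sawtooth function of (11.28), whose coefficients are $-1/2\pi i n$ for $n \ne 0$ by (11.24). The Python sketch below is an illustration under assumed parameters (order 50, a grid of 2000 points): near the jump the partial sums overshoot the values of the function, whereas the Fejér means, being averages against a positive kernel of integral 1, never leave the interval [−1/2, 1/2].

```python
import math

def partial_sum(N, t):
    """Partial sum of order N of the Fourier series of t - 1/2 on ]0, 1[."""
    return -sum(math.sin(2 * math.pi * n * t) / (math.pi * n)
                for n in range(1, N + 1))

def fejer_sum(N, t):
    """Arithmetic mean of the partial sums of orders 0, ..., N - 1."""
    return -sum((1 - n / N) * math.sin(2 * math.pi * n * t) / (math.pi * n)
                for n in range(1, N))

grid = [j / 2000 for j in range(2000)]
worst_partial = max(abs(partial_sum(50, t)) for t in grid)  # exceeds 1/2
worst_fejer = max(abs(fejer_sum(50, t)) for t in grid)      # stays below 1/2

# At the discontinuity t = 0 the means equal (f(0+) + f(0-))/2 = 0, and at
# t = 1/4 they approach f(1/4) = -1/4.
at_jump = fejer_sum(200, 0.0)
at_quarter = fejer_sum(200, 0.25)
```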

Corollary. Let f be a regulated periodic function. Then


(12.8) $\lim_{N \to \infty} \sum_{n=-N}^{N} \hat f(n) e_n(t) = \frac12[f(t+) + f(t-)]$


at every point where the Fourier series of f converges.


For the partial sums $f_N(t)$, if they converge, converge to the same limit
as their arithmetic means, which always converge to the right hand side of
(8). The corollary does not claim that the relation (8) is true for arbitrary t
and f.


13 — Uniformly convergent Fourier series 


Dirichlet’s theorem demonstrates the simple convergence of the Fourier series 
of a regulated periodic function at all points where it has left and right deriv- 
atives. In the case of the square waves we have shown by ad hoc calculations 
(Chap. III, n° 11) that in fact the series converges uniformly on every com- 
pact interval not containing discontinuities of f. One may refine the proof of 
Theorem 7 so as to cover this case and many others, for example the series 
(11.28). 

The arguments which follow being somewhat subtle, the reader is invited 
to consider them more as an exercise. 


Theorem 9. Let f be a regulated function on T and J an open arc on which
f is a primitive of a regulated function (is, for example, of class $C^1$). Then
the Fourier series of f converges to f uniformly on every compact arc $K \subset J$.


The proof we are going to set out calls on current techniques in functional 
analysis and can be divided into several stages. 


(i) Consider again the function


(*) $g_u(v) = [f(uv) - f(u)]/(v - 1)$


that we used in proving Dirichlet’s theorem. As we saw then, $g_u$ is regulated
on T if f has left and right derivatives at u, so, under the hypotheses of
Theorem 9, for every $u \in J$. Then


$f_N(u) - f(u) = \hat g_u(-N - 1) - \hat g_u(N)$

and the theorem reduces to showing that, as $|n| \to +\infty$, the functions

$u \mapsto \hat g_u(n) = G_n(u)$


converge to 0 uniformly on every compact K of J, i.e. that for every r > 0
there exists an N such that

(13.1) $(u \in K)$ & $(|n| > N) \Longrightarrow |G_n(u)| < r$.


(ii) Consider the vector space²⁴ $L^1(T)$ of regulated functions on T,
endowed with the norm $\|f\|_1 = \int |f(v)|\,dm(v)$. Then $g_u \in L^1(T)$ for every
$u \in J$, and the simplest estimate for the Fourier coefficients of an integrable
function shows that


(13.2) $|G_n(u') - G_n(u'')| = |\hat g_{u'}(n) - \hat g_{u''}(n)| \le \|g_{u'} - g_{u''}\|_1$


for any u' and $u'' \in J$.

Suppose we have shown that the map $u \mapsto g_u$ of J into $L^1(T)$ is continuous,
i.e. that for every $u \in J$ and every r > 0 there exists an r' > 0 such that


(13.3) $(u' \in J)$ & $(|u' - u| < r') \Longrightarrow \|g_{u'} - g_u\|_1 < r$.

The relation (2) then shows that


(13.4) $(u' \in J)$ & $(|u' - u| < r') \Longrightarrow |G_n(u') - G_n(u)| < r$ for every n.


This means precisely that the functions $G_n$ are equicontinuous on J (Chap. III,
n° 5). The fact that the $G_n(u)$ converge to 0 uniformly on every compact
$K \subset J$ will then follow from the following general lemma:


Lemma. If a sequence of functions $f_n$ defined and equicontinuous on a
compact set K converges simply on K, then it converges uniformly on K.


Suppose that f is the limit of the $f_n$ and let us choose an r > 0. For every
$a \in K$ there exists an open ball B(a) with centre a in K such that


$x \in B(a) \Longrightarrow |f_n(x) - f_n(a)| < r$ for every n;


this is the definition of equicontinuity. The inequality remains valid for f by
passage to the limit, which proves the continuity of f; since $|f_n(a) - f(a)| < r$
for n large one deduces that, for n large,


$|f(x) - f_n(x)| < 3r$


for every $x \in B(a)$. But, since K is compact, one may (Borel-Lebesgue) cover
it by a finite number of balls $B(a_i)$. The above inequality is then, for n large,
valid on all these balls, so on K, qed.


(iii) To prove the continuity of the map $u \mapsto g_u$ of $J$ into $L^1(\mathbb{T})$, let us
first consider, in this part of the proof, the numerator $f(uv) - f(u)$ of (*).
This is the difference between, on the one hand, the function $f_u : v \mapsto f(uv)$
obtained by “translating” the function $f$, and on the other hand the constant


²⁴ The authentic $L^1$ space in Lebesgue theory contains many other functions, but,
since it certainly contains the regulated functions, this is what we deal with here.
function $c_u : v \mapsto f(u)$. Since $\|c_{u'} - c_{u''}\|_1 = |f(u') - f(u'')|$, it is clear that
$u \mapsto c_u$ is a continuous map of $J$ into $L^1(\mathbb{T})$.

As for $u \mapsto f_u$, this is a continuous map of $\mathbb{T}$ (and not only of $J$)
into $L^1(\mathbb{T})$. This is obvious if $f$ is continuous on $\mathbb{T}$, for $f$ being then uniformly
continuous on $\mathbb{T}$, we have $|f(u'v) - f(u''v)| < r$ for every $v \in \mathbb{T}$,
so also $\|f_{u'} - f_{u''}\|_1 \le r$, so long as $|u' - u''| < r'$. In the general case, we
may choose, thanks to the lemma of n° 8, a function $\varphi \in C^0(\mathbb{T})$ such that
$\int |f(v) - \varphi(v)|\,dm(v) = \|f - \varphi\|_1 < r$. Since we are integrating with respect
to an invariant measure, we again have $\|f_u - \varphi_u\|_1 < r$ for every $u \in \mathbb{T}$. If
we now choose functions $\varphi_n \in C^0(\mathbb{T})$ which converge to $f$ in $L^1(\mathbb{T})$, the corresponding
maps $u \mapsto (\varphi_n)_u$ of $\mathbb{T}$ into $L^1(\mathbb{T})$ converge to $u \mapsto f_u$ uniformly on
$\mathbb{T}$. A uniform limit of continuous functions with values in any metric space
being again continuous, the required result follows.

So we see that the numerator of the formula (*), considered as a function
of $u \in J$ with values in $L^1(\mathbb{T})$, is continuous.


(iv) Next we have to take account of the denominator $v - 1$ and, to do
this, use our hypotheses. We shall first give the proof in the case where $f = 0$
on $J$, and show later that the general case reduces to this.

Since the compact sets $K$ and $\mathbb{T} - J$ are disjoint their distance $d$ is $> 0$.
Since $|uv - u| = |v - 1|$, we see that

(13.5) $(u \in K)\ \&\ (|v - 1| < d) \Longrightarrow uv \in J \Longrightarrow f(uv) = f(u) = 0.$


When we restrict to the $u \in K$, the functions of $v$ appearing in the numerator
of the formula (*) are thus all zero on the arc $|v - 1| < d$ of $\mathbb{T}$. Let us put

(13.6) $h(v) = (v - 1)^{-1}$ if $|v - 1| \ge d$, $\quad h(v) = 0$ if not.

The formula that defines $g_u$ shows that, for $u \in K$,

(13.7) $g_u(v) = h(v)\,[f_u(v) - c_u(v)]$ for every $v \in \mathbb{T}$.

This is essentially the definition of $g_u$ on the arc $|v - 1| \ge d$ and, by (5),
reduces to the identity $0 = 0$ on the arc $|v - 1| < d$.

Now we have $|h(v)| \le 1/d$ for any $v \in \mathbb{T}$ by (6). The relation (7) showing
that $g_u = h(f_u - c_u)$ for $u \in K$ (though not for every $u \in \mathbb{T}$) and the map
$u \mapsto f_u - c_u$ of $\mathbb{T}$ into $L^1(\mathbb{T})$ being continuous, by point (iii), it remains to
show that multiplication by the function $h$, which is bounded and independent
of $u$, preserves continuity. This is no more difficult than in the framework of
complex valued functions: it is enough to write that

$\|hf' - hf''\|_1 = \int |h(v)| \cdot |f'(v) - f''(v)|\,dm(v) \le \|h\| \cdot \|f' - f''\|_1$

for any $f', f'' \in L^1(\mathbb{T})$, where $\|h\| = \sup |h(v)|$ as always. Since, for $u', u'' \in K$
sufficiently close, the distance from $f' = f_{u'} - c_{u'}$ to $f'' = f_{u''} - c_{u''}$ is
arbitrarily small, so likewise is the distance from $g_{u'}$ to $g_{u''}$, which proves the
theorem in the case where $f = 0$ in $J$.


(v) It remains to pass to the general case. The arc $K$ being compact, and
the arc $J$ open, the distance $d$ of $K$ to the compact $\mathbb{T} - J$ is $> 0$, so that
the open arc $J'$ of $\mathbb{T}$ defined by $d(u, K) < d/2$ satisfies $K \subset J' \subset J$. By
modifying the graph of $f$ outside $J'$ one may construct a function $g$ which,
on all of $\mathbb{T}$, is a primitive of a regulated function and which, on $J'$, coincides
with $f$. Since $f - g$ vanishes on $J'$ its Fourier series converges uniformly to 0
on $K$ by section (iv) of the proof. Now the Fourier series of $g$ converges to $g$
uniformly in $\mathbb{T}$ (n° 9, Theorem 4) and so to $f$ uniformly on $K$. The relation
$f = (f - g) + g$ then completes the proof.
In Chap. II, n° 19, which the reader is strongly urged to review, we said that
a function $f$ defined on an open subset $U$ of $\mathbb{C}$ is analytic in $U$ if for every
$a \in U$ there exists a power series in $z - a$ which, on a sufficiently small disc
of centre $a$, converges to $f(z)$. In fact it represents $f(z)$ in the largest disc
$D \subset U$ where it converges, for the sum of this power series is analytic in
its disc of convergence (Chap. II, n° 19, Theorem 14) and since it is equal
to $f$ on a neighbourhood of the centre of $D$, it is equal to $f$ everywhere in
$D$ by virtue of the principle of analytic continuation (Chap. II, n° 20); the
same argument shows that the one and only power series representing $f$ on a
neighbourhood of $a$ is the Taylor series of $f$ at $a$. We know that it converges,
but we still do not know up to where it converges ...

We have also shown that, if the function $f$ is analytic, it has a derivative

(*) $f'(a) = \lim_{h \to 0} [f(a + h) - f(a)]/h$

in the complex sense at each point $a \in U$; the latter can also be obtained
by differentiating the power series representing $f$ term-by-term on a neighbourhood
of $a$. The existence of the limit (*) shows that as a function of the
real variables $x = \mathrm{Re}(z)$ and $y = \mathrm{Im}(z)$ the function $f$ has partial derivatives
satisfying the Cauchy formula

(**) $D_2 f = i D_1 f \quad (= i f')$.

On the other hand we have shown (Chap. III, n° 20, corollary of Theorem 21)
that, conversely, every holomorphic function, i.e. possessing continuous partial
derivatives satisfying (**) in an open set $U$, has a complex derivative (*)
and that its differential can be written in the form

(***) $df = f'(z)\,dz = f'(z)(dx + i\,dy)$.


In the following n° we shall show that a holomorphic function is necessar- 
ily analytic, by a method that exploits Fourier series, after which the terms 
“analytic” and “holomorphic” will become synonymous, as we have already 
announced several times in earlier chapters. Then we shall expound the simplest
consequences of this result, without seeking to enter into the detail of a
theory to which hundreds of mathematicians have, since Cauchy, each contributed
their grain of sand to the Empire State Building; Remmert’s
two volumes, 650 very condensed pages, can scarcely cover the elliptic
functions and not at all the modular and automorphic functions, Riemann 
surfaces, analytic differential equations, special functions, etc., not to speak 
of the generalisations to several variables. 
14 — Analyticity of the holomorphic functions


Having recalled these preliminaries let us consider a function $f(z)$ defined
and holomorphic on an open disc $D : |z| < R$. We would like to show that it
is represented in all this disc by a power series

(14.1) $f(z) = \sum a_n z^n.$

As we have seen in n° 1 of this chapter, or in Chap. V, n° 5, this essentially
reduces to showing that the function

(14.2) $a_n(r) = \int f(ru)\,u^{-n}\,dm(u)$

is, for every $n \in \mathbb{Z}$, proportional to $r^n$, using only the Cauchy condition or,
equivalently, the existence and the continuity of $f'(z)$.
In this direction we write

(14.3) $a_n(r) = \int_0^1 f[r\mathbf{e}(t)]\,\mathbf{e}_{-n}(t)\,dt$

and calculate the derivative of $a_n(r)$. We have to perform a differentiation
under the $\int$ sign, an operation examined in Chap. V, n° 9, Theorem 9: this
is permitted if the function of $r$ and $t$ that one is integrating has a partial
derivative with respect to $r$ and if the latter is a continuous function of the
pair $(r, t)$. The factor $\mathbf{e}_{-n}(t)$ poses no problem. Nor does the factor $f[r\mathbf{e}(t)]$: $f$
is $C^1$ and, for $t$ given, the map $r \mapsto r\mathbf{e}(t)$ is $C^\infty$. The general relation (21.2)
of Chap. II, n° 21, namely that

(14.4) $\frac{d}{dr} f[g(r)] = f'[g(r)]\,g'(r)$,

valid if $f$ is holomorphic and if $g$ is a $C^1$ function of the real variable $r$, then
shows that in our case

(14.5) $\frac{d}{dr} f[r\mathbf{e}(t)] = f'[r\mathbf{e}(t)]\,\frac{d}{dr}\,r\mathbf{e}(t) = f'[r\mathbf{e}(t)]\,\mathbf{e}(t)$

is a continuous function of the pair $(r, t)$. Thus

(14.6) $\frac{d}{dr} a_n(r) = \int_0^1 f'[r\mathbf{e}(t)]\,\mathbf{e}(t)\,\mathbf{e}_{-n}(t)\,dt.$

Since on the other hand, by the same argument,

(14.7) $\frac{d}{dt} f[r\mathbf{e}(t)] = f'[r\mathbf{e}(t)]\,\frac{d}{dt}\,r\mathbf{e}(t) = 2\pi i r f'[r\mathbf{e}(t)]\,\mathbf{e}(t)$,

(6) can again be written

$2\pi i r\,\frac{d}{dr} a_n(r) = \int_0^1 \mathbf{e}_{-n}(t)\,\frac{d}{dt} f[r\mathbf{e}(t)]\,dt.$


An integration by parts then gives

$2\pi i r\,a_n'(r) = \mathbf{e}_{-n}(t)\,f[r\mathbf{e}(t)]\,\Big|_0^1 + 2\pi i n \int_0^1 \mathbf{e}_{-n}(t)\,f[r\mathbf{e}(t)]\,dt$

since $-2\pi i n\,\mathbf{e}_{-n}(t) = \mathbf{e}'_{-n}(t)$. In the preceding relation the integrated part
is zero by periodicity and the integral on the right hand side is just $a_n(r)$.
Whence the relation

(14.8) $r a_n'(r) = n a_n(r)$

valid for $0 < r < R$.

Here we have a particularly banal differential equation. Putting $b_n(r) = a_n(r) r^{-n}$
for $r > 0$ and differentiating, one finds that
$b_n'(r) = r^{-n-1}[r a_n'(r) - n a_n(r)] = 0$; the
function $b_n(r)$ is therefore constant, whence

(14.9) $a_n(r) = a_n r^n$

with a coefficient $a_n$ independent of $r$.
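
The relation (14.9) can be checked numerically. The following is only a sketch under stated assumptions: the function name `fourier_coefficient`, the choice $f = \exp$ (for which $a_n = 1/n!$) and the sample count are illustrative and not part of the text; the integral (14.2) is replaced by an equally spaced Riemann sum on the circle.

```python
import cmath
import math

def fourier_coefficient(f, r, n, samples=256):
    """Approximate a_n(r) = integral of f(r u) u^(-n) dm(u) over the unit circle
    by an equally spaced Riemann sum (dm = normalised invariant measure)."""
    total = 0j
    for k in range(samples):
        u = cmath.exp(2j * math.pi * k / samples)
        total += f(r * u) * u ** (-n)
    return total / samples

# Illustrative choice (an assumption, not from the text): f = exp, with a_n = 1/n!.
for n in (0, 1, 2, 5):
    for r in (0.5, 1.0, 2.0):
        ratio = fourier_coefficient(cmath.exp, r, n) / r ** n
        # The ratio a_n(r) / r^n should not depend on r and should be close to 1/n!.
        print(n, r, ratio.real, 1 / math.factorial(n))

# For a negative index the coefficient should be numerically zero, as in (14.11).
print(abs(fourier_coefficient(cmath.exp, 1.0, -3)))
```

The ratio $a_n(r)/r^n$ is printed for several radii so that its independence of $r$ can be read off directly.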
For $r \le \rho < R$, one has, by (2),

(14.10) $|a_n r^n| \le \sup_{|z| \le \rho} |f(z)| = M_f(\rho) < +\infty.$

For $n < 0$, $r^n$ increases indefinitely when $r$ tends to 0; (10) then shows that

(14.11) $a_n = 0$ for $n < 0$,

so that the Fourier series $\sum a_n(r) u^n$ of $f(ru)$ reduces to the power series
$\sum a_n z^n$ for $z = ru$. Since, on the other hand, the function $u \mapsto f(ru)$ is
of class $C^1$ on $\mathbb{T}$, its Fourier series converges absolutely and represents the
function in question everywhere.

In particular, the power series $\sum a_n z^n$ converges for $|z| < R$. One may
furthermore see this without invoking Theorem 8: choose a $\rho$ such that
$|z| < \rho < R$, put $|z| = q\rho$ with $q < 1$, and write

(14.12) $|a_n z^n| = |a_n \rho^n|\,q^n \le M_f(\rho)\,q^n.$

In conclusion:


Theorem 10 (Cauchy, 1831). Let f be a holomorphic function in an open 
set U in C. Then f is analytic in U and, for every a € U, the Taylor series 
of f at a converges and represents f in the largest open disc with centre a 
contained in U. 
It is enough, in the preceding arguments, to replace the disc |z| < R by
the largest disc |z — a| < R in question or, if one prefers, to consider the 
function f(a +z). Now the only power series that can possibly represent f 
on a neighbourhood of a is the Taylor series of f at a as we know (Chap. II, 
n° 20). Whence the theorem. 

If you believe that Cauchy understood everything immediately, you are 
in error. He perfectly understood Fourier series and integrals from 1815, and 
in 1822 had obtained the integral formula for a circle for the holomorphic 
functions (i.e. satisfying his PDE) by quite another method. Now one needs 
only a few lines of simple calculations to pass from there to Theorem 10 (see 
n° 21). Freudenthal, an excellent Dutch mathematician who has seriously ex- 
amined Cauchy’s works, voices the hypothesis, in his notice in the DSB, that 
he had forgotten his own results. His political, religious and social activities 
probably occupied too great a place in his life²⁵ ...


15 — The maximum principle 


Let $f$ be a holomorphic function in an open set $U \subset \mathbb{C}$ and again consider the
Cauchy formula (14.2), which, for $n = 0$, can be written as

(15.1) $f(a) = \int f(a + ru)\,dm(u)$

for every $a \in U$, where one integrates with respect to the invariant measure
of $\mathbb{T}$ and where $r$ is small enough for $U$ to contain the closed disc $|z - a| \le r$.
This implies

(15.2) $|f(a)| \le \sup_{u \in \mathbb{T}} |f(a + ru)|.$

Assume now that $|f|$ has a local maximum at $a$, i.e. that there exists an $r > 0$
such that

(15.3) $|f(z)| \le |f(a)|$ for every $z$ such that $|z - a| \le r$.


²⁵ On Cauchy, see also Bruno Belhoste, Cauchy, un mathématicien légitimiste
au XIXe siècle (Paris, Belin, 1985) and Augustin-Louis Cauchy. A Biography
(Springer, 1991), the mathematical information in which does not replace
Freudenthal’s notice. The book by C. A. Valson, La vie et les oeuvres du Baron 
Cauchy (1868) deserves to be read as a particularly comic example of would-be 
edifying hagiography, but is difficult to find; it was demolished immediately by 
Joseph Bertrand (Bull. de la Soc. Math. de France, 1, 1870) who, while insisting 
on the importance of Cauchy’s discoveries, recalled his irresistible need to publish 
(more than 750 articles), frequently several times, incorrect, incomplete results, 
such as he had found the same day before breakfast, as we say nowadays. The 
Cours d’analyse of 1821 has recently been republished in facsimile by Ellipses; 
reading it could be a very useful exercise (to detect the errors in the argument). 
On teaching at the Polytechnique, see Bruno Belhoste, Amy Dahan Dalmedico 
and Antoine Picon, La formation polytechnicienne 1794-1994 (Dunod, 1994), a 
collection of articles by twenty or so historians and in the main very interesting. 
Applying Parseval-Bessel to the Fourier series

$f(a + ru) = \sum c_n r^n u^n,$

one obtains by (3)

$\sum |c_n|^2 r^{2n} = \int |f(a + ru)|^2\,dm(u) \le |f(a)|^2 = |c_0|^2,$

whence $c_n = 0$ for every $n \ge 1$. The power series for $f$ at the point $a$ then
reduces to its constant term, so that there is a disc of centre $a$ on which $f$ is
constant.
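
The mean value formula (15.1) and the Parseval-Bessel identity used above lend themselves to a quick numerical check. This is a minimal sketch, assuming a hypothetical test polynomial `f`, a hypothetical centre `a` and radius `r`, and a Riemann sum `circle_mean` in place of the integral with respect to $dm$; none of these names come from the text.

```python
import cmath
import math

def circle_mean(g, samples=512):
    """Mean of g(u) over the unit circle for the normalised measure dm,
    approximated by an equally spaced Riemann sum."""
    return sum(g(cmath.exp(2j * math.pi * k / samples)) for k in range(samples)) / samples

def f(z):
    # Hypothetical test function: a polynomial, hence holomorphic on C.
    return z ** 3 - 2 * z + 1j

a, r = 0.3 + 0.2j, 0.75

# (15.1): f(a) is the mean of f over the circle |z - a| = r.
mean_value = circle_mean(lambda u: f(a + r * u))

# Taylor coefficients c_n of f at a: f(a), f'(a), f''(a)/2!, f'''(a)/3!.
c = [f(a), 3 * a ** 2 - 2, 3 * a, 1]

# Parseval-Bessel: sum |c_n|^2 r^(2n) equals the mean of |f(a + r u)|^2.
lhs = sum(abs(cn) ** 2 * r ** (2 * n) for n, cn in enumerate(c))
rhs = circle_mean(lambda u: abs(f(a + r * u)) ** 2)

print(abs(mean_value - f(a)), abs(lhs - rhs))
```

Both printed differences should be at the level of rounding error, since the Riemann sum is exact for trigonometric polynomials of degree below the sample count.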

Now, in Chap. II, n° 20, we proved a principle of analytic continuation 
stating that if, in a connected open set U, two analytic functions coincide on 
a neighbourhood of a particular point of U, then they coincide in all of U. If 
in particular a holomorphic function in U is constant on a neighbourhood of 
a particular point of U, it is constant in U. Conclusion: 


Theorem 11. Let $f$ be a holomorphic function in a connected open set $U$.
Then $f$ is constant if at a point of $U$ the function $|f|$ has either a local
maximum or a nonzero local minimum.


The case of a local minimum reduces to the preceding case on considering
the function $1/f$: this is defined and holomorphic on a neighbourhood of a
nonzero local minimum of $|f|$, and $|1/f|$ has a local maximum there; $1/f$ (and so $f$) is thus
constant on a disc, so $f$ is constant on $U$.

The connectedness hypothesis is essential: if $U$ is, for example, the union
of two disjoint open discs $D'$ and $D''$, then the behaviour of $f$ on $D''$ has no
bearing on its behaviour on $D'$; $f$ might be equal to 1 in $D'$ and to $e^z$ in $D''$.

An open connected set is generally called a domain; one most often uses 
the letter G (in German, domain = Gebiet) to denote connected open sets. 


Corollary 1. Let $G$ be a bounded domain in $\mathbb{C}$, $K$ its closure, $F = K - G$
its frontier and $f$ a function defined and continuous in $K$ and holomorphic
in $G$. Then

(15.4) $\|f\|_K = \|f\|_G = \|f\|_F.$

Since $G$ is bounded, $K$ is bounded and closed, hence compact. The continuous
function $|f(z)|$ therefore attains its maximum at a point $a \in K$. If
$a \in G$, Theorem 11 shows that $f$ is constant in $G$, hence in $K$, and the corollary
is obvious. If $f$ is not constant, the maximum of $|f(z)|$ is thus attained on
$F$, whence $\|f\|_K = \|f\|_F$. But since $f$ is continuous in $K$, its value at a point
of $F$ is the limit of values taken at points of $G$, whence $\|f\|_F \le \|f\|_G \le \|f\|_K$
since $G \subset K$, qed.
Corollary 2. Let $G$ be a bounded domain and $(f_n)$ a sequence of functions
defined and continuous on the closure $K$ of $G$ and holomorphic in $G$. Assume
that the $f_n$ converge uniformly on the boundary $F$ of $G$ to a limit function.
Then the $f_n$ converge uniformly on $K$ and the limit function is holomorphic
in $G$.

Consider the functions $f_{pq} = f_p - f_q$. Cauchy’s criterion for uniform
convergence shows that, for every $r > 0$, one has $\|f_{pq}\|_F < r$ for $p$ and $q$ large,
and thus (Corollary 1) $\|f_{pq}\|_K < r$. The $f_n$ therefore converge uniformly in
$K$, so in $G$, and it remains to apply Theorem 17, to be found below (n° 19).


Corollary 3 ((H. A.) Schwarz’ lemma). Let $f$ be a function holomorphic
and bounded on a disc $|z| < R$ and having a zero of order $p$ at the origin.
Then

$|f(z)| \le M\,|z/R|^p$ where $M = \sup |f(z)|$.

The assumption about $f$ implies that $f(z) = z^p g(z)$ where $g$ is, like $f$,
the sum of a power series in $|z| < R$. The relation $|z^p g(z)| \le M$ shows that
$|g(z)| \le M/r^p$ for $|z| = r < R$, so also, by the maximum principle, for $|z| \le r$.
On letting $r$ tend to $R$, one deduces that

$|g(z)| \le M/R^p$, whence $|f(z)| \le M\,|z|^p/R^p$

for every $z$, qed.
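
As a sanity check of Schwarz' lemma, one may test the inequality on sample points. The sketch below assumes a hypothetical example $f(z) = z^2(z - 1/2)$ on the unit disc, for which $p = 2$ and $M = 3/2$ (since $|z|^2\,|z - 1/2| < 3/2$ for $|z| < 1$, the bound being approached as $z \to -1$); the names `schwarz_bound_holds` and `points` are illustrative only.

```python
import cmath
import math

def f(z):
    # Hypothetical example: zero of order p = 2 at the origin, bounded on |z| < 1.
    return z ** 2 * (z - 0.5)

R, p = 1.0, 2
M = 1.5  # sup of |f| on |z| < 1, approached as z tends to -1

def schwarz_bound_holds(z, tol=1e-12):
    """Check |f(z)| <= M |z/R|^p up to a small rounding tolerance."""
    return abs(f(z)) <= M * abs(z / R) ** p + tol

# Sample points on four circles inside the disc.
points = [rho * cmath.exp(2j * math.pi * k / 24)
          for rho in (0.1, 0.5, 0.9, 0.99) for k in range(24)]
all_ok = all(schwarz_bound_holds(z) for z in points)
print(all_ok)
```

A sample test of this kind can of course only fail to contradict the lemma; it proves nothing.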
Theorem 11 can be extended in part to unbounded domains, but this is 
more difficult and rather constitutes an exercise: 


Theorem 12. Let $G$ be a domain in $\mathbb{C}$ and $f$ a function defined, continuous
and bounded on the closure $\bar{G}$ of $G$ and holomorphic in $G$. Then

(15.5) $\|f\|_G = \|f\|_F$

where $F = \bar{G} - G$ is the boundary of $G$.


The case where $G$ is bounded having been treated already, let us assume
$G$ unbounded. First consider the simplest case, where $f$ tends to 0 at infinity,
i.e. where, for every $\varepsilon > 0$, one has $|f(z)| < \varepsilon$ for every $z \in \bar{G}$ of large enough
modulus. Since $f$ is continuous in $\bar{G}$, the inequality $|f(z)| \ge \varepsilon$ defines a closed
subset $K$ of $\bar{G}$; since $|f(z)| < \varepsilon$ for $|z|$ large, $K$ is bounded, so compact, and it
is nonempty if $\varepsilon$ is chosen small enough, unless $f = 0$. There
is therefore an $a \in K$ where the function $|f(z)|$ attains its maximum relative
to $K$. For every $z \in \bar{G}$ one then has $|f(z)| \le |f(a)|$, either trivially if $z \in K$,
or because $|f(z)| < \varepsilon \le |f(a)|$ if $z \notin K$. Theorem 11 then shows that $a \in \bar{G} - G$ (qed),
unless $f$ is constant, in which case there is nothing to prove.

Now let us pass on to the general case of a function bounded in $G$ but not
necessarily tending to 0 at infinity and assume for example that $|f(z)| \le 1$
on the boundary $F$ of $G$; we are then to show that $|f(z)| \le 1$ in all $G$ too.
Assume $|f(a)| > 1$ for some $a \in G$ and consider a closed disc $D : |z - a| \le r$
contained in $G$. Theorem 11 shows that the maximum $M$ of $|f|$ on the boundary
of $D$ is $> 1$. Consider now in the domain $H = G - D$ the functions

(15.6) $f_n(z) = r f(z)^n / M^n (z - a).$

Since $f$ is bounded in $G$, the introduction of a denominator $z - a$ shows that
the $f_n$ tend to 0 at infinity in $H$. One is therefore in the particular case
examined first. Now the boundary of $H$ is clearly the union of the boundary
$F$ of $G$ and that of the disc $D$. On $F$, by hypothesis $|f(z)| \le 1$, and since
$M$ and $|(z - a)/r|$ are $> 1$, one has $|f_n(z)| < 1$ on $F$. The same result holds
on the boundary of $D$ since there $|f(z)| \le M$ and $|z - a| = r$. Thus we see
that the function $|f_n(z)|$ is $\le 1$ on the boundary of $H$, and since it tends to 0
at infinity we conclude that $|f_n(z)| \le 1$ in $H$. The exponent $n$ in (6) being
arbitrary, this forces $|f(z)/M| \le 1$ in $H$. This relation also being satisfied in
$D$, it holds everywhere in $G$. Thus $|f|$ attains its maximum $M$ in $G$ at a point of
the boundary of $D$; by Theorem 11, $f$ is then constant in $G$, hence in $\bar{G}$, which
is incompatible with $|f(a)| > 1$ and $|f(z)| \le 1$ on $F$, qed.

The hypothesis that the function f is bounded in G and not only on its 
boundary is essential in the above. All this has been prodigiously refined. 


16 — Functions analytic in an annulus. Singular points. Meromorphic functions


The arguments of n° 14 in fact apply to a function defined in an annulus
$C : R_1 < |z| < R_2$ and in particular on a disc with its centre deleted if
$R_1 = 0$. For every circle $|z| = r$ contained in $C$ the Fourier coefficients of the
function $f(ru)$ are again of the form $a_n r^n$, but the argument showing that
$a_n = 0$ for $n < 0$ no longer applies since, even in the case where $R_1 = 0$, the
function, for example $1/z$, has no reason to be bounded on a neighbourhood
of 0. What survives is the Fourier series of $f(ru)$, namely

(16.1) $\sum a_n r^n u^n$ with $a_n r^n = \int f(ru)\,u^{-n}\,dm(u)$,

where this time the sum is extended over $\mathbb{Z}$, converges absolutely and represents
$f$ in $C$ by Theorem 4 of n° 9. Here again, one may see the convergence
directly. Let us choose numbers $r_1$ and $r_2$ such that $R_1 < r_1 < |z| < r_2 < R_2$
(strict inequalities) and let $M$ be the uniform norm of $f$ on the compact
annulus $C'$ delimited by the circles of radii $r_1$ and $r_2$. Now $|a_n r^n| \le M$ for
$r_1 \le r \le r_2$ by (1), since $|f(ru)| \le M$, whence

(16.2) $|a_n z^n| = \begin{cases} |a_n r_2^n| \cdot |z/r_2|^n \le M\,|z/r_2|^n & \text{for } n \ge 0, \\ |a_n r_1^n| \cdot |z/r_1|^n \le M\,|z/r_1|^n & \text{for } n < 0; \end{cases}$

since $|z/r_2| < 1$, the first inequality proves the absolute convergence of the
“positive” part of the series (1), and since $|z/r_1| > 1$, the second proves that
of its “negative” part.
The series (1) even converges normally in $C'$ and so on every compact²⁶
$K \subset C$. Thus, in $C'$,

$|a_n z^n| \le \begin{cases} |a_n r_2^n| & \text{if } n \ge 0, \\ |a_n r_1^n| & \text{if } n < 0; \end{cases}$

now we know that the series $\sum a_n z^n$ converges absolutely in $C$, thus for
$z = r_1$ or $r_2$; whence, in $C'$, a bound by a convergent series independent of $z$.
In conclusion:
In conclusion: 


Theorem 13 (Laurent). Let $f$ be a holomorphic function in an annulus
$C : R_1 < |z| < R_2$. Then we have a series expansion

(16.3) $f(z) = \sum_{n \in \mathbb{Z}} a_n z^n$ with $a_n = r^{-n} \int f(ru)\,u^{-n}\,dm(u)$,

the series converging normally on every compact $K \subset C$.
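
The formula for the coefficients in (16.3) can be tried out numerically. The following sketch rests on assumptions that are not in the text: the example $f(z) = 1/(z(1 - z)) = 1/z + 1 + z + z^2 + \ldots$ in the annulus $0 < |z| < 1$, the helper name `laurent_coefficient`, and a Riemann sum in place of the integral over $\mathbb{T}$.

```python
import cmath
import math

def laurent_coefficient(f, r, n, samples=512):
    """Approximate a_n = r^(-n) * integral of f(r u) u^(-n) dm(u), as in (16.3),
    by an equally spaced Riemann sum on the unit circle."""
    total = 0j
    for k in range(samples):
        u = cmath.exp(2j * math.pi * k / samples)
        total += f(r * u) * u ** (-n)
    return total / samples / r ** n

def f(z):
    # Hypothetical example, holomorphic in the annulus 0 < |z| < 1:
    # f(z) = 1/(z(1 - z)) = 1/z + 1 + z + z^2 + ...
    return 1 / (z * (1 - z))

# The computed a_n should not depend on the radius r chosen inside the annulus:
# close to 0 for n < -1 and close to 1 for n >= -1.
for r in (0.3, 0.6):
    print(r, [abs(laurent_coefficient(f, r, n)) for n in range(-3, 4)])
```

Using two different radii illustrates that the right hand side of (16.3) is independent of $r$, which is the content of the relation $a_n(r) = a_n r^n$.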


An expansion of this type is called a Laurent series; it is the sum of a
power series in $z$ and of a power series in $1/z$. The first converges at least for
$|z| < R_2$ and the second for $|z| > R_1$ since a power series necessarily converges
on a disc. This allows us to write that, in $C$, we have a decomposition $f(z) =
g(z) + h(z)$ of $f$ into a function $g$ holomorphic for $|z| < R_2$ and a function $h$
holomorphic for $|z| > R_1$.

We may write (16.3) à la Leibniz like the Cauchy formula of n° 1. Putting
$\zeta = ru$ we have $a_n = \int f(\zeta)\,\zeta^{-n}\,dm(u)$; but for $u = \mathbf{e}(t)$ we have
$d\zeta = 2\pi i r\,\mathbf{e}(t)\,dt = 2\pi i \zeta\,dm(u)$. Whence

$2\pi i\,a_n = \int f(\zeta)\,\zeta^{-n-1}\,d\zeta,$

the “line” integral being taken along any circle $t \mapsto r\mathbf{e}(t)$ contained in $C$.
Cauchy’s theory will illuminate this point and, in particular, will explain
why the result is independent of the circle $|\zeta| = r$ chosen.


Theorem 13 serves mainly to study the behaviour of a holomorphic function
on a neighbourhood of an isolated singular point $a$, i.e. of a function defined
and holomorphic on a neighbourhood of $a$, except at the point $a$ itself.
There one has a series expansion $f(z) = \sum c_n (z - a)^n$, whence the distinction
between the poles, where the series includes only a finite number of nonzero
terms of degree $< 0$ (the minimal degree, its sign changed, is called the order
of the pole in question), and the essential singular points where it includes

²⁶ Such a compact subset is contained in $C'$ if one chooses the radii $r_1$ and $r_2$
suitably: the continuous real function $z \mapsto |z|$ attains its minimum and its maximum
on $K$, which lie strictly between the radii of $C$ since the limiting
circles of $C$, which do not lie in $C$, do not meet $K$.
infinitely many, the case for example of the function $\exp(1/z) = \sum z^{-n}/n!$ at
$z = 0$. It is also useful to define the order of a zero $a$ of $f$, i.e. of a point where
$f(a) = 0$: this is the degree of the first nonzero term of the power series of $f$
at $a$.

This leads to the fundamental concept of a function meromorphic in an
open set $U$: this is a function $f$ defined and analytic in $U - D$, where $D$
is a discrete subset of $U$ (i.e. such that, for every $a \in U$, there exists a
disc of centre $a$ containing only a finite number of points of $D$), and having only
polar singularities at the points of $D$. This, for example, is the case of the
elliptic functions of Chap. II, n° 23 and, in fact, the “right” definition of the
elliptic functions, for there are many, other than the Eisenstein series, is to
impose on them just that they should be simultaneously doubly periodic and
meromorphic in $\mathbb{C}$, as Liouville discovered (n° 18).

We note that, if $D'$ is the set of zeros of a meromorphic function $f$ in $U$,
then the union $D \cup D'$ is again a discrete subset of $U$. This results from the
fact that, if $a$ is any point in $U$, then $f(z) = (z - a)^p g(z)$, where $p$ is an integer and $g$ is a power
series whose constant term is not zero, so such that $g(a) \ne 0$; on a small
enough disc of centre $a$ we again have $g(z) \ne 0$ since $g$ is continuous, so that
on a neighbourhood of $a$ the function $f$ can have no other zero or pole than
the point $a$ itself. On the other hand, the zeros or poles of a function having
an essential singular point at $a$ may accumulate at $a$: the zeros of $\sin(1/z)$
converge to 0.

One may perform the usual algebraic operations on the functions meromorphic
in a given open set $U$: sum, product, quotient; as we shall see, one
again obtains meromorphic functions in $U$.

The case of a sum $f + g$ is obvious: if $f$ and $g$ have poles at the points
of two discrete subsets $D$ and $D'$ of $U$, the function $f + g$ is holomorphic
outside $D \cup D'$ and it is clear that at a point of $D \cup D'$ it has at most a
pole; “at most” because the polar parts of the Laurent series of $f$ and $g$ at
a common pole may cancel each other. For $fg$, holomorphic outside $D \cup D'$,
one observes that at a point $a \in D \cup D'$ one has the relations

$f(z) = f_1(z)/(z - a)^p, \quad g(z) = g_1(z)/(z - a)^q$

where $f_1$ and $g_1$ are holomorphic on a neighbourhood of $a$ and nonzero at $a$.
It is then clear that $fg$ has a pole of order $p + q$ at $a$ if $p + q > 0$. Of course it can happen
that a pole of $f$ is neutralised by a zero of $g$.

The case of the quotient $f/g$ reduces, as always, to that of the reciprocal
$1/g(z)$ of a meromorphic function. On a neighbourhood of a pole $a$ of $g$ one
has $g(z) = g_1(z)/(z - a)^q$ where $g_1(z)$ is a power series such that $g_1(a) \ne 0$.
The function $g_1$ has a reciprocal $1/g_1(z)$ holomorphic on a neighbourhood
of $a$; the formula $1/g(z) = (z - a)^q/g_1(z)$ then shows that the pole $a$ of
order $q$ is transformed into a zero of order $q$ of $1/g(z)$. At a point $a$ where $g$
is holomorphic one has $g(z) = (z - a)^q g_1(z)$ where $g_1$ is a power series not
vanishing at $a$, whose reciprocal is thus holomorphic on a neighbourhood
of $a$, whence clearly a pole of order $q$ for $1/g(z)$ if $g$ has a zero of order $q$
at $a$. Finally we see that $1/g(z)$ is holomorphic outside the zeros of $g$, which
are the poles of $1/g$, the poles of $g$ contrariwise providing the zeros of $1/g$.
Since the zeros of $g$ form a discrete set in $U$, the function $1/g$ is therefore
meromorphic in $U$.

Laurent series can be manipulated like power series. The domain of convergence
of a series such as $f(z) = \sum a_n z^n$ is necessarily an annulus $C$ since
it is the sum of a power series in $z$ and of a power series in $1/z$ which converge
for $|z| < R_2$ and $|1/z| < 1/R_1$ respectively. The multiplication formula

$\sum a_n z^n \cdot \sum b_n z^n = \sum c_n z^n$ with $c_n = \sum_p a_p b_{n-p}$

valid for power series still applies, restricted to an annulus where the two
series converge: they then converge absolutely, therefore unconditionally, so
on multiplying term-by-term one obtains a double series $\sum a_p b_q z^{p+q}$ which
converges unconditionally (Chap. II, n° 22) and in which one may reorder the
terms arbitrarily (Chap. II, n° 18, Theorem 13: associativity), for example
as a function of the value of $p + q$.

One may also differentiate a Laurent series term-by-term; the simplest
way to see this is to write $f(z) = g(z) + h(1/z)$ where $g$ and $h$ are power
series, whence

$f'(z) = g'(z) - h'(1/z)/z^2$

by the chain rule for analytic functions (Chap. II, n° 22, Theorem 17);
as we know, $g'(z)$ is obtained by differentiating the “positive” part of the
series $\sum a_n z^n$ term-by-term; since $h(z) = \sum_{n>0} a_{-n} z^n$ and so
$h'(z) = \sum_{n>0} n a_{-n} z^{n-1}$, it follows that

$h'(1/z)/z^2 = \sum_{n>0} n a_{-n} z^{-n+1}/z^2 = \sum_{n>0} n a_{-n} z^{-n-1} = -\sum_{n<0} n a_n z^{n-1}$

and finally

$f'(z) = \sum_{n>0} n a_n z^{n-1} + \sum_{n<0} n a_n z^{n-1} = \sum n a_n z^{n-1}$

as one had hoped, the Laurent series of $f'$ converging in the same annulus $C$
as $f$. Note that there is no term in $1/z$ in the result.


An essential difference from power series will appear when one looks for
a primitive of $f$, i.e. a holomorphic function $F$ such that $F'(z) = f(z)$ in
$C$. The function $F$ is, like $f$, represented by a Laurent series and as we
have just seen the derived series contains no term in $1/z$. The problem can
therefore have no solution if the series $f(z)$ contains one. The analogous
problem for a real variable had, in the XVIIth century, defied the efforts of
several mathematicians before Newton and Mercator, and Newton himself,
with his systematic use of Laurent series having a finite number of terms of
negative degree, is very reticent on this subject. Apart from this “detail”, on which the
whole of Cauchy’s “residue calculus” is founded, the existence of the
primitive when $a_{-1} = 0$ is as obvious as for the case of power series: one
puts $F(z) = \sum a_n z^{n+1}/(n + 1)$. The arguments employed for power series in
Chap. I, n° 19 or, equivalently, the fact that in the domain $C$ of convergence
of the series, the $a_n$ of positive index and those of negative index are bounded
by geometric progressions, show that these operations (differentiation or
“integration” term-by-term) lead to series converging in the same annulus
as the initial series.

In the case where $f$ contains a term in $1/z$, one may again consider the
function $F(z) = \sum a_n z^{n+1}/(n + 1)$, where one forgets the term of index
$n = -1$; instead of $f = F'$ one obtains the relation

(16.4) $f(z) = F'(z) + a_{-1}/z.$


The coefficient $a_{-1}$, the radical obstruction to the existence of a primitive of
$f$ in the annulus $C$, is called the residue of $f$; more generally, if one considers
a function $f$ holomorphic on a neighbourhood of a point $c$ of $\mathbb{C}$ except at
the point itself, which is then an isolated singular point of $f$, the residue of
$f$ at $c$ is by definition the coefficient $a_{-1}$ of $1/(z - c)$ in the expansion of $f$
as a Laurent series $\sum a_n (z - c)^n$ about the point $c$; one writes $\mathrm{Res}_c(f)$ or
$\mathrm{Res}(f, c)$. The formula (3) applied for $n = -1$ shows that

(16.5) $\mathrm{Res}_c(f) = \int f(c + ru)\,ru\,dm(u) = \int_0^1 f(c + re^{2\pi i t})\,re^{2\pi i t}\,dt$

or, à la Leibniz,

(16.6) $\mathrm{Res}_c(f) = \frac{1}{2\pi i} \int_{|z - c| = r} f(z)\,dz$

with an integral taken around the circle $|z - c| = r$ as above; naturally one
has to choose $r$ small enough that, except at the point $c$, the function $f$ will
be holomorphic in an open set containing the closed disc $|z - c| \le r$.
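
Formula (16.5) gives a direct way of approximating a residue. The sketch below is an illustration under assumptions: the helper name `residue`, the sample count and the two test functions are hypothetical choices, and the integral over $[0, 1]$ is replaced by an equally spaced Riemann sum. For $e^z/z^2 = 1/z^2 + 1/z + \ldots$ the residue at 0 is 1; for $1/(z(1 - z))$ the residues at 0 and 1 are 1 and $-1$.

```python
import cmath
import math

def residue(f, c, r, samples=512):
    """Approximate Res_c(f) = integral over [0, 1] of f(c + r e(t)) r e(t) dt,
    formula (16.5), where e(t) = exp(2 pi i t)."""
    total = 0j
    for k in range(samples):
        w = r * cmath.exp(2j * math.pi * k / samples)
        total += f(c + w) * w
    return total / samples

def g(z):
    # Hypothetical example with a double pole at 0: e^z / z^2.
    return cmath.exp(z) / z ** 2

def h(z):
    # Hypothetical example with simple poles at 0 and 1.
    return 1 / (z * (1 - z))

# r = 0.5 keeps the other pole of h outside the closed disc |z - c| <= r.
print(residue(g, 0, 1.0), residue(h, 0, 0.5), residue(h, 1, 0.5))
```

The radius must be chosen, as the text requires, so that the closed disc contains no singular point other than $c$.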

The preceding arguments show more generally that if $f$ is a meromorphic
function in an open set $U$, the existence of a primitive $F$ of $f$ in $U$ presupposes
that $\mathrm{Res}_a(f) = 0$ at every pole $a$ of $f$. This necessary condition is
not sufficient, even if $f$ is holomorphic, except in very particular open sets
(“simply connected”, i.e. homeomorphic²⁷ to a disc). To study this question
requires the complete Cauchy theory, i.e. the use of the line integrals which
we shall develop later in this treatise (Chap. VIII).


To return to the function 1/z, one might claim that the function 


²⁷ Two metric or topological spaces $X$ and $Y$ are said to be homeomorphic if there
is a continuous bijection $X \to Y$ with continuous inverse.
(16.7) $\mathrm{Log}\,z = \log |z| + i \arg z = \frac{1}{2} \log(x^2 + y^2) + i \arctan y/x$


of Chap. IV, n° 14 and 21 is its primitive; now, in $\mathbb{C}^*$, the latter is anything but
a function in the strict sense of the term, as we know, because of the same
problem for $\arg z$. There are open sets $U$ in which the function $1/z$ has a
primitive, namely those in which the pseudo-function $\mathrm{Log}\,z$ decomposes into
uniform branches: for we know [Chap. IV, §4, section (v)] that, if $L$ is such
a branch, one has

$L(z) = L(a) - \sum_{n \ge 1} (1 - z/a)^n/n$

on a neighbourhood of every point $a \in U$, so that $L$ is analytic and that
$L'(z) = 1/z$ in $U$; the function $L$ is then a primitive of $1/z$ in $U$. If, conversely,
$1/z$ has a primitive $f(z)$ in a connected open $U \subset \mathbb{C}^*$, then $(e^f)' = f' e^f$,
so that the function $g = e^f$ satisfies $zg' - g = 0$ or $(g/z)' = 0$; since $U$ is
connected one has $g(z) = cz$ for a constant $c$ that we may assume equal to 1
by adding a suitable constant to $f$. Consequently, $f$ is a uniform branch of Log
in $U$. To find a primitive of $1/z$ in $U$ thus reduces exactly to constructing a
uniform branch of Log in $U$. A “punctured” disc (i.e. with its centre removed)
of centre 0 is the very type of open set for which the problem has no solution.

An analogous problem arises when one wants to define the non integer 
powers of a complex number, i.e. the “function” z ↦ z^s where s is an 
arbitrary complex number. In view of the formula $a^s = e^{s\log a}$ of Chap. IV, 
valid for a real > 0, it would seem natural to define 


(16.8) $z^s = e^{s\operatorname{Log} z}$


but the ambiguity of Log then transfers to the left hand side. If, however, 
one restricts to an open set U on which the multiform correspondence Log 
decomposes into uniform branches, the choice of such a branch L yields a 
holomorphic function $e^{sL(z)}$ which, in its turn, is a “uniform branch of the 
multiform function z^s”; the latter is unique up to a constant factor of the 
form $e^{2k\pi is}$. If for example U = C − R_−, in which case one may choose 


(16.9) $L(z) = \log|z| + i\arg z$ with |arg z| < π 

as we have seen in Chap. IV, §4, one finds 

(16.10) $z^s = |z|^s e^{is\arg z}$


where |z|^s = exp(s.log|z|) is the expression defined unambiguously in 
Chap. IV, n° 14 and where the argument is chosen as above. If for example 
s = 1/2, one thus obtains two uniform branches, opposites, of z^{1/2}. This type 
of problem arises frequently in the residue calculus à la Cauchy. 
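
As a small numerical illustration of (16.9) and (16.10), the following Python sketch evaluates the branch of z^s determined by the choice |arg z| < π on U = C − R_−. The function names are purely illustrative and do not come from the text; the sketch assumes only that `cmath.phase` returns an argument in [−π, π], which makes it the argument of (16.9) away from the negative real axis.

```python
import cmath
import math

def branch_log(z: complex) -> complex:
    """L(z) = log|z| + i arg z with |arg z| < pi, as in (16.9); z must avoid R_-."""
    return complex(math.log(abs(z)), cmath.phase(z))

def branch_power(z: complex, s: complex) -> complex:
    """The uniform branch e^{s L(z)} of z^s on C - R_-."""
    return cmath.exp(s * branch_log(z))

def power_via_modulus(z: complex, s: complex) -> complex:
    """Right hand side of (16.10): |z|^s e^{i s arg z}."""
    return cmath.exp(s * math.log(abs(z))) * cmath.exp(1j * s * cmath.phase(z))

z = complex(-1.0, 2.0)
w = branch_power(z, 0.5)   # one uniform branch of z^(1/2)
other = -w                 # the opposite branch, i.e. the factor e^{2k pi i s} with k = 1
```

Replacing L by L + 2πi multiplies the result by e^{2πis}, which for s = 1/2 is −1: this is why exactly two branches appear in that case.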
17 — Periodic holomorphic functions 


The method used in n° 14 to obtain the expansion of a holomorphic function 
in power series applies equally to the Fourier series of periodic holomorphic 
functions. 

A function f defined and holomorphic in an open set U has a period a ≠ 0 
in U if f(z + a) = f(z) for any z ∈ U. This clearly assumes that z ∈ U implies 
z + a ∈ U. By considering the function f(az) one reduces to the case where 
a = 1. Then f(x + 1, y) = f(x, y) for any x + iy = z ∈ U, which suggests 
expanding in a Fourier series with respect to x. The only reasonable situation 
is that where U is a horizontal strip 


(17.1) a < Im(z) < b 


of finite or infinite height, so that, for every y ∈ ]a,b[, the function x ↦ 
f(x, y) = f(x + iy) is defined on all R, is periodic, and C^∞ since f is analytic. 
We then have a much more than absolutely convergent expansion 


(17.2) $f(x + iy) = \sum a_n(y) e_n(x)$


with 


(17.3) $a_n(y) = \oint f(x + iy) e^{-2\pi inx}\,dx,$


the mean value over a period. Whence 


(17.4) $a_n(y) e^{2\pi ny} = \oint f(x + iy) e^{-2\pi in(x+iy)}\,dx.$


We shall see that this integral is independent of y. 
Since the function 


$g(z) = f(z) e^{-2\pi inz}$


is as holomorphic and periodic as f is, it is enough to give the proof for n = 0, 
i.e. to show that $a_0(y) = \oint f(x + iy)\,dx$ is a constant. 

The theorem on differentiation under the ∫ sign (Chap. V, n° 9, Theorem 9) 
clearly applies to the function f(x, y). Thus, using the Cauchy differential 
equation, 


$a_0'(y) = \oint D_2 f(x, y)\,dx = i\oint D_1 f(x, y)\,dx;$


by the FT, this integral is the variation of the function x ↦ f(x, y) over a 
period interval. Consequently, $a_0'(y) = 0$, qed²⁸. 


28 Most authors employ the Cauchy integral around a rectangular contour to 
obtain this quasi trivial result; the method adopted here extends to the periodic 
solutions of many other partial differential equations than D_2 f = iD_1 f, and in 

Now we have $a_n(y) = a_n e^{-2\pi ny}$ with a constant a_n, which puts (3) in the 
much more pleasant form 


(17.5) $f(z) = \sum a_n e^{2\pi inz}$ with $a_n = \oint f(x + iy) e^{-2\pi in(x+iy)}\,dx.$


We may differentiate the series term-by-term since to differentiate with 
respect to z amounts to differentiating with respect to x, which is allowed by 
the theory of Fourier series for C^∞ functions (n° 9, Theorem 5). 

The series converges normally on every closed strip c ≤ Im(z) ≤ d 
contained in U, i.e. such that a < c ≤ d < b. In such a strip, we have 
$|e^{2\pi inz}| = e^{-2\pi ny} \le e^{-2\pi nc} + e^{-2\pi nd}$ since the monotone function 
$e^{-2\pi ny}$ lies between its values at c and d on [c,d]. The general term of (5) is 
thus bounded by $|a_n| e^{-2\pi nc} + |a_n| e^{-2\pi nd}$ on the closed strip in question, but 
since (5) converges absolutely for a < Im(z) < b and so for Im(z) = c or d, the two 
series $\sum |a_n| e^{-2\pi nc}$ and $\sum |a_n| e^{-2\pi nd}$ converge, so their sum does too. We 
thus obtain a series independent of z which dominates the series (5) in the 
closed strip c ≤ Im(z) ≤ d: whence normal convergence. In conclusion: 


Theorem 14 (Liouville). Let f be a holomorphic function of period 1 in 
an open strip B: a < Im(z) < b. We then have a series expansion 


$f(z) = \sum a_n e^{2\pi inz}$


which converges normally on every closed strip B' ⊂ B. The coefficients a_n 
are given by the relation 


$a_n = \oint f(x + iy) e^{-2\pi in(x+iy)}\,dx = \int_c^{c+1} f(x + iy) e^{-2\pi in(x+iy)}\,dx$


for arbitrary y ∈ ]a,b[ and c ∈ R. We may differentiate the Fourier series 
of f term-by-term any number of times. 


Conversely, if a complex Fourier series, i.e. of the form (5), converges 
absolutely in a strip a < Im(z) < b, the preceding argument shows that the 
series converges normally on every closed strip (and so on every compact set) 
contained in the given open strip. Theorem 17 below will show that the sum 
of the series is analytic. 
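
The independence of the coefficients a_n from y can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the test function 1/(2 − e^{2πiz}) is our own choice (it has period 1, is holomorphic for Im(z) > −log 2/(2π), and its expansion has a_n = 2^{−(n+1)} for n ≥ 0 and a_n = 0 for n < 0), and the mean value over a period is approximated by an equally spaced Riemann sum.

```python
import cmath
import math

def fourier_coefficient(f, n: int, y: float, samples: int = 256) -> complex:
    """Approximate a_n = mean over x in [0, 1) of f(x + iy) e^{-2 pi i n (x + iy)}."""
    total = 0j
    for j in range(samples):
        z = complex(j / samples, y)
        total += f(z) * cmath.exp(-2j * math.pi * n * z)
    return total / samples

def f(z: complex) -> complex:
    # Illustrative choice: period 1, holomorphic for Im(z) > -log(2)/(2 pi),
    # equal to the sum over n >= 0 of 2^{-(n+1)} e^{2 pi i n z}.
    return 1 / (2 - cmath.exp(2j * math.pi * z))

# The same coefficients, computed on three different horizontal lines of the strip.
coeffs = {y: [fourier_coefficient(f, n, y) for n in range(-1, 4)]
          for y in (0.0, 0.2, 0.5)}
```

Each list in `coeffs` should agree, up to rounding, with (0, 1/2, 1/4, 1/8, 1/16), whatever the value of y.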


18 — The theorems of Liouville and of d’Alembert-Gauss 


We can now establish the theorem of Liouville to which we alluded à propos 
the differential equation for the function ℘ of Weierstrass (Chap. II, end 


fact Fourier himself, Poisson, Liouville, etc. applied it to the PDEs of physics 
known at their time — propagation of heat, the wave equation, etc. — whose 
solutions are not holomorphic functions of (x,y). Exercise: find the general form 
of the periodic solutions in t of the equation $f''_{tt} - f''_{xx} = cf$, where c is a 
constant. 

of n° 23). In its enunciation an entire function is, by definition, a holomorphic 
function on all C: a polynomial, an exponential function, exp(exp(exp(sin z))), 
etc. 


Theorem 15 (Liouville). Let f be an entire function such that 

(18.1) $f(z) = O(z^p)$ when |z| → +∞, 


where p is an integer ≥ 0. Then f is a polynomial of degree ≤ p. In particular, 
a bounded entire function is constant. 


By Theorem 10 one has an expansion $f(z) = \sum a_n z^n$ valid for any z. The 
relation (14.10) then shows that, for every n, 


(18.2) $|a_n| r^n \le M_f(r)$


where M_f(r) is the upper bound of |f(z)| on the circumference |z| = r, or, 
equivalently, by the maximum principle, on the disc |z| ≤ r. So assume that 
|f(z)| ≤ M|z|^p for every z large enough. It follows that M_f(r) ≤ Mr^p, hence 


$|a_n| \le M r^{p-n}$ for r large, 


whence a_n = 0 for every n > p since then the right hand side tends to 0 at 
infinity, qed. 

One of the most famous and simplest applications of Theorem 15 is a 
proof (Gauss found four) by the same Liouville of the miraculous²⁹ 


Theorem 16 (d’Alembert-Gauss). Every algebraic equation of degree 
≥ 1 has at least one complex root. 


To see this, consider the function f(z) = 1/p(z) where p is a polynomial 
not vanishing anywhere. Since p(z) is analytic in C, so also is f (Chap. II, 
n° 22, Theorem 17 — one may also invoke holomorphy, easier to prove for 
1/p). Now, at infinity, p(z) is equivalent to its term of highest degree, say 
az^r, so that 

$f(z) \sim 1/az^r = O(z^{-r}).$


Since r ≥ 0, f is bounded at infinity, so is constant by Liouville, impossible 
if p is of degree > 0. 


Liouville’s theorem allows us to complete the proof (Chap. II, n° 23) of 
the differential equation 


29 The complex numbers were invented to calculate the roots of equations of the 
third degree with the help of formulae involving square roots of negative numbers. 
The “miraculous” nature of the d’Alembert-Gauss theorem is that it allows us 
to ascribe roots to equations of any degree and even though, for n > 4, no 
one has ever discovered, or ever will discover, algebraic “formulae”, simple or 
complicated, to calculate the roots of a general equation of degree n. 

$\wp'(z)^2 = 4\wp(z)^3 - 20a_2\wp(z) - 28a_4$


of the Weierstrass function ℘. As we said then, it is obvious, if one calculates 
à la Newton, that the difference f(z) between the two sides is a doubly 
periodic function analytic on a neighbourhood of z = 0; it is thus so in all 
C because its only singularities must be those of the function ℘, namely the 
periods ω; in other words, f is an entire function. But, just as the function 
sin x takes all its values over R on [0, 2π], by its periodicity, likewise a doubly 
periodic function takes no other values in C than those on the parallelogram 
constructed on the fundamental periods ω′ and ω″, i.e. on the compact set 
of the points 
z = u′ω′ + u″ω″ with u′, u″ ∈ [0,1]. 

Being continuous, an entire elliptic function is bounded on such a 
parallelogram, so on all C, so is constant by Liouville. It remains to observe that, 
in the present case, the function f is zero for z = 0, as is obvious from the series 
expansion of the Weierstrass function ℘. 

In fact, it was à propos the elliptic functions that Liouville found his 
theorem in 1843-44 and it is instructive to follow the evolution of his ideas 
on this point³⁰ since they developed following a logic opposite to that, now 
classical, which we have just expounded. 

Liouville first proves, using an idea of Hermite’s, that a nonconstant 
function cannot have two real periods α and β whose ratio is irrational. To do 
this one writes (in our notation) that $f(t) = \sum a_n e(nt/\alpha)$ and one checks, 
on replacing t by t + β, that $a_n = a_n e(n\beta/\alpha)$; if β/α ∉ Q, the exponential is 
≠ 1 for any n ≠ 0, qed. The result had already been proved in another way 
by Jacobi, and Liouville reckoned that his proof was equivalent to “looking 
for difficulties where there were none”. 

But he then had the idea of using the same method to show a priori 
that if a doubly periodic function, with periods whose ratio is not real, “does 
not become infinite”, i.e. is holomorphic everywhere in C, then it is constant. 
Despite the 40,000 pages of Liouville’s notes — for the most part not published 
during his life, nor later, and deposited in the Paris Académie des sciences —, 
we do not know really how he proceeded. All the same, Lützen has recovered 
a note where Liouville writes that if a “function of x + √−1 y” admits an 
imaginary period ω = α + √−1 β, then it can be expanded in a Fourier series 
of the form (in our notation) 


(18.3) $f(z) = \sum a_n e_n(z/\omega)$


where e_n(z) = exp(2πinz): this is Theorem 14. Assume ω = 1 to simplify; 
Liouville writes like us that $f(x + iy) = \sum a_n(y) e_n(x)$; from this he deduces 
that, for h ∈ R, 


30 See Jesper Lützen, Joseph Liouville 1809-1882, Master of Pure and Applied 
Mathematics (Springer, 1990, 884 p.), chap. XIII. 

$f(x + h + iy) = \sum a_n(y) e_n(h) e_n(x),$


and so 

$a_n(y) e_n(x) = \oint f(x + h + iy) \overline{e_n(h)}\,dh$


and consequently that the left hand side “is a function of x + iy”, so must 
be of the form a_n e_n(x + iy) with constant a_n, a rather weak argument ... 

Liouville made no reference to the holomorphy or analyticity of f(z); for 
him, in 1844, what mattered was that f should be a “function of x + √−1 y”, 
which perhaps means that there is an algebraic or analytic expression for 
f(z) involving only z. In fact, one sees him use Cauchy’s equation $f'_y = if'_x$ 
in another note of the same period, and the standard formula 


$a_n(y) = \oint f(x + iy) \overline{e_n(x)}\,dx$


to establish a differential equation satisfied by the a_n(y) and from it to 
deduce that they are proportional to exp(−2πny), which yields (3) directly, 
as we saw in the preceding n°. 

Lützen does not tell us how Liouville deduces from (3) that an everywhere 
holomorphic doubly periodic function is constant, but this was surely as 
obvious to him as to us: if f has the periods 1 and ω (nonreal), a case to 
which one may always reduce, one must have 


$\sum a_n e_n(z) = \sum a_n e_n(\omega) e_n(z)$


and thus a_n = a_n e_n(ω), whence a_n = 0 for n ≠ 0 since the relation e_n(ω) = 1 
requires that nω ∈ Z. 

To pass to the theorem for arbitrary entire functions Liouville used the 
theory of elliptic functions. If f is a bounded entire function and if φ is an 
elliptic function (actually one of the Jacobi functions), then the composite 
function f[φ(z)] is again elliptic and, being bounded, can have no poles. It 
is therefore a constant, and consequently so is f (if one knows that φ takes 
all possible complex values³¹). This argument appears in a four line note 
which Lützen reproduces on p. 543 of his book, the proof being shortened to 
a “Consequently, etc..” 

Liouville announced his ideas on the elliptic functions to the Académie³² 
in December 1844 and immediately had to face an offensive from Cauchy at 


31 On subtracting a constant, if necessary, it is enough to show that an elliptic 
function φ always has zeros. But if this were not the case, the elliptic function 
1/φ would be holomorphic everywhere, the poles of φ included, so constant. 

32 where he entered in 1838 after a battle of which Lützen provides us a 
particularly edifying summary. The first two hundred pages of his book, which 
report thoroughly on the social situation of the mathematicians in France at this 
period, are full of incidents of this kind, and might illustrate the African proverb 
according to which two (and a fortiori fifteen) male crocodiles cannot coexist 
in the same backwater. The most famous scientists at this period could accrue 

the following meeting: the latter recalled how a year previously he had shown 
how his theory of residues allowed one to reconstruct Jacobi’s theory very 
easily (?); that in 1843 he had announced that if a function f(z) “always has 
a unique and determined value, if, further, it reduces to a certain constant 
for all the infinite values of z [which, apparently, means that f(z) tends to 
a limit when |z| → +∞], then it reduces to this same constant when the 
variable z takes an arbitrary finite value”; finally, applying this result to 
[f(z) − f(a)]/(z − a) for a fixed a, a function which tends to 0 at infinity if 
f is bounded and which, at z = a, remains “continuous” since f is 
differentiable, he deduces the general form of Liouville’s theorem: “If a function 
f(z) of the real or imaginary variable z always remains continuous [which, in 
Cauchy’s language, probably means: is everywhere differentiable in the 
complex sense], and consequently always finite, it reduces to a simple constant”. 
We are thus brought back to the quarrels about priority (already discussed, 
Chap. III, n° 10), an exercise much in vogue in the France of the period, 
and of which Cauchy was probably the historic champion in all categories. 
Lützen thinks that Liouville knew the result before Cauchy, but even if so it 
is nevertheless the date of publication which counts and not the manuscripts 
which an historian may discover a century and a half later. The situation is 
not particularly clear ... 

This tendency of Liouville’s not to publish again led to problems here, 
à propos the elliptic functions (i.e. meromorphic and having two periods with 
non real ratio). Between 1844 and 1847 he proved theorems which became 
the starting point for later expositions; essentially they consisted of charac- 
terising the elliptic functions through their poles and zeros, so avoiding the 
traditional calculations on elliptic integrals and Jacobi series; it all depends 
on the nonexistence of everywhere holomorphic elliptic functions. Example: 
let f and g be two elliptic functions and assume that both of them have sim- 
ple zeros and simple poles at exactly the same points; then f/g is everywhere 
holomorphic and elliptic, so constant. 

Liouville did not publish these results though he explained them in private 
to two young Germans, Carl Wilhelm Borchardt and Ferdinand Joachim- 
stahl; on returning to Germany, the first put his notes in order and sent 
copies to Liouville and to the latter’s two friends, Jacobi and Dirichlet. His 
ideas were then known beyond the Rhine; Borchardt published them in 1880 


three well remunerated posts (of the order of 6,000 F per annum, while a coach 
at the X or in another école had to be content with 100 to 150 F per month): 
Sorbonne, Collège de France, Polytechnique, CNAM, Bureau des Longitudes, etc. 
Just imagine the competition. The system considerably reduced the chances of 
the scientists who had not yet acquired high standing obtaining a suitable post, 
and, because of this, was strongly criticised. Moreover, when the polytechnicien 
Liouville, after resigning from the Corps des Ponts et Chaussées, found himself 
in this situation, and had, at the start, to teach nearly forty hours per week in 
secondary public or private establishments and at the X, he claimed he had no 
more time to do research ... For another very different example, see Maurice 
Crosland, Gay-Lussac: Scientist and Bourgeois (Cambridge UP, 1978). 

in the principal German journal when he became its editor. In 1851, having 
been elected to the Collège de France, Liouville, who clung to his priority 
against Cauchy, devoted a year to the subject — years are not very long at 
the Collège ... — before a small audience among whom were Charles Briot 
and Jean Bouquet, supporters of Cauchy who in 1859 published a Théorie 
des fonctions doublement périodiques, the first overall exposition of the 
theories of Cauchy and of Liouville; these had been completed elsewhere, since 
1844, mainly by Hermite, the first to use Cauchy’s ideas. But Liouville had 
not published, and, after Briot and Bouquet, refused to. In his later years 
he expressed his resentment of them³³, “vile thieves but highly dignified 
Jesuits. Elected as thieves by the Académie!!!!”, underlined in the text. In 1876, 
when Liouville was elected a foreign member of the Berlin academy, 
Weierstrass energetically reestablished the truth, recalling that it was all already 
in Borchardt’s notes and that Briot and Bouquet ought to have mentioned 
that they owed all to Liouville. But nobody, Weierstrass included, ever 
understood why he had not published; he had had fifteen years to do so before 
Briot and Bouquet. 

Nor had Cauchy yet discovered the Laurent series, and although he dis- 
covered equation (1.3) of n° 1 of this Chapter in about 1825 concerning 
holomorphic functions — without using the Fourier series which he was well 
placed to know —, he seems to have forgotten the result and it was only in 
1831-32 and more probably in 1840-41 that he discovered the analyticity 
of his functions as we said above, with his ideas beginning to clarify about 
1850. As Hans Freudenthal has written in his excellent biography in the 
DSB, “he would have missed much more if others had cared about matters 
so general and so simple as those which occupied Cauchy”. His works are 
confused, repetitive, with invalid or absurdly complicated proofs, yet never- 
theless he produced, at the final count, a formidable branch of analysis and 
a method of genius for obtaining integrals which nobody had known how 
to calculate before him. It is curious that they did not attract the 
attention of his contemporaries, mainly of Germans like Gauss³⁴ or Jacobi who, 
at the same time, manipulated analytic functions every day for years (but 
perhaps without ascribing any importance to analyticity, since they rarely 
encountered anything else), beginning with elliptic functions; Cauchy, using 
his own methods, provided in 1846 the first nonmiraculous explanation of 
their double periodicity, which had been obtained by Abel and Jacobi using 


33 Lützen, p. 201. It should be understood that Liouville was a republican and 
secularist. He was deputy for Toul in the first National Assembly elected after 
the revolution of 1848. His friend Dirichlet said for his part at the same period 
that every mathematician had to be a democrat, probably because it is neither 
necessary nor sufficient to have inherited a title or a fortune to be able to do 
mathematics. One is not even always to be encouraged to inherit mathematical 
ability from his father, or, let us be politically correct, from his (or her) mother. 

34 In fact, it seems that Gauss had discovered some of Cauchy’s results before the 
latter, but, as was his wont, had not published them. 

complicated addition formulae already established for real variables. It was 
Riemann (1851) and above all Weierstrass and his pupils who then placed 
the theory on a solid base. Riemann’s works on algebraic functions³⁵, an 
extraordinary mixture of topology, algebraic geometry and complex analysis, 
were so far in advance of the time that it needed a full fifty years before peo- 
ple began to understand and then generalise them, without ever trivialising 
them. In the meantime, Cauchy’s theories, which Briot and Bouquet spread 
in Germany, and Weierstrass’, prospered prodigiously; Remmert notes that 
the German edition of a book by the Italian G. Vivanti cites 672 titles before 
1904. So prodigiously that, in France before 1940 to mention only one case, 
it monopolised the attention of a large number of mathematicians at the 
expense of the new branches which were being developed elsewhere, includ- 
ing the much more difficult theory of the holomorphic functions of several 
variables from which there came the most spectacular progress after 1950, 
mainly in France (H. Cartan and J.-P. Serre) and in Germany (H. Behnke, H. 
Grauert, R. Remmert and K. Stein); but this required totally different meth- 
ods — differentiable varieties, algebraic topology, functional analysis, etc. — 
and entirely new ideas, the road from one to several complex variables being 
far too long to reduce to a simple generalisation. 


Since we have just spoken of Liouville in a chapter that mainly treats 
Fourier series, we should mention his discovery, along with the Genevan 
Charles Sturm, of a formidable generalisation of harmonic analysis; it con- 
sists of replacing the exponentials by the “eigenfunctions” of a differential 
operator satisfying given “boundary conditions”. 

In Sturm-Liouville theory one considers a differential equation of the form 


(18.4) $-x''(t) + q(t)x(t) = 0$


on the compact interval I = [0,1] where q is a given function, real and 
continuous in I. Putting Lx = −x″ + qx (cf. the notation Dx for the derived 
function x′), one terms eigenfunctions of L the nonzero solutions of the 
equation 


(18.5) $Lx(t) = \lambda x(t)$

where λ = μ² is a given constant; cf. the eigenvectors of a matrix or of a 
linear operator in R^n. In the trivial case Lx = −x″ one finds the functions 
a.exp(iμt) + b.exp(−iμt) where a and b are arbitrary constants. The problem 
then consists of studying the solutions of (5) which satisfy the boundary 
conditions 


35 A function ζ = f(z) is said to be algebraic if one has a relation P(z, ζ) = 0, 
where P is a given polynomial with complex coefficients. The first difficulty is 
that, for z given, the equation provides several possible values for ζ. We are not 
dealing with functions on C in the strict sense of the term, but with “multiform 
functions” in the sense of Chap. IV, § 4 or correspondences in the sense of 
Chap. IV whose graphs are, except in a few respects, the “Riemann surfaces” of 
Chap. X. 

(18.6) $x'(0) - ux(0) = x'(1) + vx(1) = 0$


where u and v are given real constants. The numbers λ for which (5) and 
(6) have a solution x ≠ 0 are now called the eigenvalues of the “boundary 
problem”. All this came, chez Sturm and Liouville, from the PDE of heat 
propagation from which Fourier had already derived his series. 

A first remark (Sturm) is that if x ≠ 0, the eigenvalue λ is real. By 
calculating the scalar product of Lx and x on I, one has, in telegraphic style, 


$(Lx|x) = -\int x''\bar{x} + \int qx\bar{x} = \int x'\bar{x}' - [x'(1)\bar{x}(1) - x'(0)\bar{x}(0)] + \int qx\bar{x} =$

$= \int_I q|x|^2 + \int_I |x'|^2 + u|x(0)|^2 + v|x(1)|^2,$


a real result, since u, v and q(t) are real. But since Lx = λx, the left hand 
side reduces to λ∫|x|², whence λ ∈ R and λ is even > 0 if the function q has 
positive values as well as u and v (an hypothesis justified by the physics). 
These are the same calculations as in algebra to show that the eigenvalues of 
a hermitian matrix are real. 

A second problem is to show that, ignoring the conditions (6), the equa- 
tion (5) always has solutions, and even a solution for which the initial con- 
ditions 
(18.7) x(0) = a, x′(0) = b 


are given. By replacing q(t) by q(t) − λ one reduces to the equation x″ = qx. 
Liouville then remarked, as Lützen (Chap. X, p. 447) tells us, that if one 
considers the differential equations 


(18.8) $x_0'' = 0,\ x_1'' = qx_0,\ x_2'' = qx_1, \ldots,$

then the function 


(18.9) $x(t) = x_0(t) + x_1(t) + x_2(t) + \ldots$


manifestly satisfies the equation x″ = qx: differentiate the series 
term-by-term; Liouville does not worry himself, at least at the beginning, with 
justifying this operation: Weierstrass (1815-1897) did not yet reign supreme in 
the 1830s and one might still, despite Cauchy, or because of Cauchy and his 
errors, calculate almost as Euler did. If one imposes the conditions 


(18.10) $x_0(t) = a + bt,\quad x_n(0) = x_n'(0) = 0$ for n ≥ 1, 


it is clear that the series (9) satisfies (7). But by the FT the conditions (10) 
impose, for n ≥ 1, the relation 


(18.11) $x_n(t) = \int x_n'(t)\,dt = \int dt \int x_n''(t)\,dt = \int dt \int q(t)x_{n-1}(t)\,dt$

where, like Liouville, we have violated the ban on denoting both a phantom 
and a free variable by one and the same letter. Whence an impressive formula, 
written in an entirely explicit way by Liouville, 


(18.12) $x = x_0 + \int dt \int qx_0\,dt + \int dt \int q\,dt \int dt \int qx_0\,dt + \int dt \int q\,dt \int dt \int q\,dt \int dt \int qx_0\,dt + \ldots$


where, for the value t of the variable, the integrals³⁶ are taken between 0 
and t. Of course one has to prove that this series converges, which Liouville 
did perfectly correctly, by separating (12) into two series corresponding 
respectively to x_0(t) = a and to x_0(t) = bt; modernising his language and 
putting M = sup |q(t)| = ‖q‖, one has³⁷, for x_0(t) = a and with the notation 
t^{[n]} = t^n/n!, 


$|x_1(t)| \le M \int dt \int |x_0(t)|\,dt = |a| M t^{[2]},$

$|x_2(t)| \le M \int dt \int |x_1(t)|\,dt \le |a| M^2 t^{[4]},$


and more generally $|x_n(t)| \le |a| M^n t^{[2n]}$, whence convergence; the calculation 
is similar for x_0(t) = bt. We used similar calculations in Chap. VI, n° 10, to 
demonstrate the existence of solutions of the Bessel equation by means of the 
method of successive approximations; this fits into Liouville’s schema, apart 
from the fact that it takes place on the interval ]0,+∞[ with a function q 
singular at the origin. These calculations of Liouville’s show that it is to him, 
and not to Émile Picard (1890), that one should attribute the invention of 
this method as Lützen justifiably notes; Cauchy had another method a little 
later and would in his turn adopt the method of successive approximations 
to majorise the solutions, if not to prove their existence. 
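
Liouville's scheme (18.8)-(18.12) can be carried out exactly when q is a polynomial. The following Python sketch is an illustration under that assumption (the text only requires q continuous): polynomials are stored as coefficient lists, lowest degree first, and all helper names are ours. With q = 1 the equation is x″ = x, whose solution with x(0) = a, x′(0) = b is a cosh t + b sinh t.

```python
from fractions import Fraction
import math

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def poly_add(p, q):
    """Sum of two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def primitive(p):
    """Primitive vanishing at 0: the coefficient of t^k moves to t^(k+1)/(k+1)."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(p)]

def liouville_terms(q, a, b, count):
    """Return [x_0, ..., x_count] with x_0 = a + b t and x_n = primitive(primitive(q x_{n-1}))."""
    terms = [[Fraction(a), Fraction(b)]]
    for _ in range(count):
        terms.append(primitive(primitive(poly_mul(q, terms[-1]))))
    return terms

def poly_eval(p, t):
    """Horner evaluation in floating point."""
    value = 0.0
    for c in reversed(p):
        value = value * t + float(c)
    return value

q = [Fraction(1)]                          # q(t) = 1, so the equation is x'' = x
terms = liouville_terms(q, a=1, b=0, count=12)
partial_sum = [Fraction(0)]
for term in terms:
    partial_sum = poly_add(partial_sum, term)
```

In this case x_n(t) = t^{2n}/(2n)!, so the bound |x_n(t)| ≤ |a| M^n t^{[2n]} is attained and `partial_sum` is a Taylor polynomial of cosh t.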

Having done this, one has to return to the boundary conditions (6), which 
impose drastic restrictions on the solutions. The first fundamental results are 
those of Sturm and rely on extremely ingenious arguments; assuming u, v 
and q > 0, and then λ > 0 and μ real, he shows that the problem (5), 
(6) has a countable infinity of eigenvalues λ_1 < λ_2 < ... [to do this he 
compares the solutions of x″ = (q − λ)x with those, trigonometric, of the 
equation x″ = −n(λ)²x where the constant n(λ) satisfies n(λ)² ≤ λ − q(t) 
for every t], that to each eigenvalue λ_n there corresponds, up to a constant 
factor, exactly one eigenfunction u_n(t), that one may assume it real, that 
these are orthogonal on the interval I, i.e. that ∫ u_p u_q = 0 for p ≠ q, and finally 
that u_n has n zeros which interlace with the n − 1 zeros of u_{n−1}. Liouville, himself, 


36 If one denotes by P the operator which associates the primitive which vanishes 
at 0 to every continuous function on [0,1], the relation (12) means that 


$x = x_0 + P^2qx_0 + P^2qP^2qx_0 + P^2qP^2qP^2qx_0 + \ldots,$


where P² = P ∘ P and where each operator P² applies to all that follows it. 
37 The operator P transforms the function t^{[n]} into t^{[n+1]}. 

obtained an asymptotic evaluation of the λ_n and, above all, showed that the 
u_n allow one to expand arbitrary functions in “Fourier” series 


$f(t) = \sum c_n(f) u_n(t)$ with $c_n = (f|u_n)/(u_n|u_n)$, 

the purpose of the denominator being to give the “vector” u_n the length 1 
as with the functions e_n(t) in Fourier’s theory. 

All this, invented by Sturm and Liouville (and even published ...) in 
the years 1830-1840, was a half-century ahead of its time. The question was 
taken up again from 1880-1890; the proofs were corrected; the method was 
extended to noncompact intervals (example: the Bessel equation of Chap. VI), 
which is noticeably more difficult and requires expansions in “Fourier” 
integrals involving the eigenfunctions; we enter the framework of the theory 
of integral equations — they had already appeared chez Liouville — then of 
Hilbert spaces, etc. This theory has given rise to quite remarkable developments 
even very recently (scattering, the Korteweg-de Vries equation). The Soviet 
school, particularly B. M. Levitan, has worked enormously at this subject for 
a full half-century³⁸. 


Example. Suppose q = 0, a case that seems trivial. Then (5) can be written 
as x″ + μ²x = 0, whence $x(t) = ae^{i\mu t} + be^{-i\mu t}$. The relations (6) can be 
written 


$i\mu(a - b) - u(a + b) = i\mu(ae^{i\mu} - be^{-i\mu}) + v(ae^{i\mu} + be^{-i\mu}) = 0,$


whence two linear homogeneous equations to determine a and b up to a 
constant factor. One can have (a,b) ≠ (0,0) only if the determinant 


$\begin{vmatrix} i\mu - u & -i\mu - u \\ (i\mu + v)e^{i\mu} & (-i\mu + v)e^{-i\mu} \end{vmatrix} = (i\mu - u)(v - i\mu)e^{-i\mu} + (u + i\mu)(v + i\mu)e^{i\mu}$

is zero; on putting z = (iμ − u)(v − iμ) this can be written 

$e^{2i\mu} = z/\bar{z}.$

This result is of modulus 1, so that μ is real. On separating the real and 
imaginary parts we see that μ must satisfy the relation 


$\cos\mu = \pm(\mu^2 - uv)/(\mu^2 + u^2)^{1/2}(\mu^2 + v^2)^{1/2},$


38 For a remarkably clear resumé of the state of the question, see the articles 
on “Sturm-Liouville” in the Soviet encyclopedia of mathematics (Encyclopaedia 
of Mathematics, Reidel, 10 Vol., 1988-1994, the Soviet edition dating from 
1977-1985) where one can, more generally, inform oneself on almost every 
question in mathematics and find a bibliography of the subject (completed by the 
translators). Amusing detail: the article “cryptology” is entirely the work of the 
translators; the Soviet editors had omitted it. They also forgot to credit some 
out-of-favour colleagues ... 

a “transcendental” equation, as one said at the time, whose roots are the 
eigenvalues sought. For μ large, the right hand side tends to ±1, 
so that one can obtain an asymptotic expansion of the roots of the equation 
by the method of Chap. VI, n° 7, an easy exercise; it is more difficult to 
prove “by hand” that the eigenfunctions allow one to expand arbitrary, say 
C^1, functions in series. For u = v = 0, i.e. for the boundary conditions x′(0) = 
x′(1) = 0, one finds a = b and cos μ = ±1, and so μ = πn; the eigenfunctions 
are the functions cos πnt and one recovers Fourier series properly called. 
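
The transcendental equation of this example can be solved numerically. The Python sketch below is a hedged illustration, with names of our own choosing: for real μ, u, v the determinant equals z e^{−iμ} minus its complex conjugate, so it vanishes exactly when Im(z e^{−iμ}) = 0, and a root of that real function is located by bisection in an interval chosen by hand. The values u = v = 1 are an arbitrary test case.

```python
import cmath
import math

def characteristic(mu: float, u: float, v: float) -> float:
    """Im(z e^{-i mu}) with z = (i mu - u)(v - i mu); zero exactly when the determinant is."""
    z = (1j * mu - u) * (v - 1j * mu)
    return (z * cmath.exp(-1j * mu)).imag

def bisect(g, lo: float, hi: float, steps: int = 200) -> float:
    """Plain bisection; assumes g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_lo < 0) == (g_mid < 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def boundary_residuals(mu: float, u: float, v: float):
    """Left hand sides of the two conditions (18.6) for x(t) = a e^{i mu t} + b e^{-i mu t}."""
    a, b = 1j * mu + u, 1j * mu - u      # this choice satisfies the condition at t = 0
    x = lambda t: a * cmath.exp(1j * mu * t) + b * cmath.exp(-1j * mu * t)
    dx = lambda t: 1j * mu * (a * cmath.exp(1j * mu * t) - b * cmath.exp(-1j * mu * t))
    return dx(0) - u * x(0), dx(1) + v * x(1)

u, v = 1.0, 1.0
mu = bisect(lambda m: characteristic(m, u, v), 1.0, 2.0)   # sign change checked by hand
```

For u = v = 0 the same function reduces to −μ² sin μ, whose roots μ = πn give back the cosine series.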


19 — Limits of holomorphic functions 


One of the most remarkable aspects of the theory of holomorphic functions 
is that when a sequence of such functions converges in only quite a mild way 
(convergence in the sense of the theory of distributions suffices) then (i) the 
limit function is holomorphic, (ii) the derived sequences converge, (iii) the 
successive derivatives of the limit are the limits of the successive derivatives 
of the given sequence; Paradise. Since the holomorphic functions are analytic 
and so C^∞ as functions of x and y, Theorem 23 of Chap. III, n° 22 will serve 
us so long as we prove point (ii) — convergence of the derivatives — which then 
allows us to apply it, as we have seen, since the Cauchy condition $f'_y = if'_x$ 
extends trivially to the limit. 
The proof rests on a lemma worth isolating: 


Lemma. Let r and R be two numbers such that0 < r < R < +00 and let 
k be a positive integer. There exists a constant M;(r, R) such that, for every 
function f, continuous in |z| < R and holomorphic in |z| < R, one has 


(19.1) sup | f(z) 


lz|<r 


< M,(r, R). sup |f(z)|- 
leI<R 


We actually proved in n° 1 [pass to the limit in (1.11) as $r \to R$] that the
derivatives of $f$ are given by


(19.2) $f^{(k)}(z) = k! \int f(Ru)\, \frac{Ru}{(Ru - z)^{k+1}}\, dm(u)$ for $|z| < R$.

For $|z| \le r$ one has $|Ru - z| \ge R - r$, whence

$|Ru/(Ru - z)^{k+1}| \le R/(R - r)^{k+1}$.


On substituting in (2), one obtains (1) immediately, with $M_k(r, R) =
k!\,R/(R - r)^{k+1}$, qed. (One could, in the right hand side of (1), replace the
sup extended over $|z| \le R$ by a sup extended over $|z| = R$, but these sup are
the same by the maximum principle.)
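
As a numerical illustration of the lemma, the following sketch evaluates the integral formula (19.2), read here as $f^{(k)}(z) = k! \int f(Ru)\, Ru/(Ru - z)^{k+1}\, dm(u)$, by sampling the circle uniformly. The function names and the choice of a uniform Riemann sum are assumptions made for this example; they are not part of the text.

```python
import cmath
import math

def derivative_via_circle(f, z, k, R=1.0, samples=256):
    """Approximate f^(k)(z) for |z| < R by a uniform Riemann sum of (19.2)."""
    total = 0j
    for m in range(samples):
        u = cmath.exp(2j * math.pi * m / samples)  # point u = e(t) of the unit circle
        total += f(R * u) * R * u / (R * u - z) ** (k + 1)
    return math.factorial(k) * total / samples

def lemma_constant(k, r, R):
    """The constant M_k(r, R) = k! R / (R - r)^(k+1) of the lemma."""
    return math.factorial(k) * R / (R - r) ** (k + 1)

if __name__ == "__main__":
    z = 0.3 + 0.2j
    print(derivative_via_circle(cmath.exp, z, 3), cmath.exp(z))
    print(lemma_constant(3, 0.5, 1.0))
```

For the exponential function every derivative is again the exponential, so the first two printed values should agree closely, and the modulus of the derivative stays far below $M_k(r, R) \sup |f|$.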

If one writes $D$ and $D'$ for the concentric closed discs of arbitrary centre $a$
and radii $R$ and $r < R$, and if one considers a function $f$, continuous in $D$
and holomorphic in the interior, the lemma above, applied to the function
$f(a + z)$, shows that for every $k$ there exists a constant $M_k(D', D)$ independent
of $f$ such that


(19.1') $\|f^{(k)}\|_{D'} \le M_k(D', D)\, \|f\|_D$.


This crucial point established, suppose that in an open subset $U$ of $\mathbb{C}$ we
have a sequence of analytic functions $f_n(z)$ which converge to a limit $f(z)$
uniformly on every compact $K \subset U$. For every $a \in U$, let us choose the
closed discs $D \subset U$ and $D' \subset D$ with centre $a$ as above, and apply (1') to the
differences $f_p - f_q$, whose usefulness Cauchy has shown us. His convergence
criterion tells us that the right hand side of (1') is $< \varepsilon$ for $p$ and $q$ large
enough since the $f_n$ converge uniformly on the compact set $D$. Likewise for
the left hand side. Consequently, the $f_n^{(k)}$ converge uniformly in $D'$, i.e. on a
neighbourhood of $a$. This means that the $f_n^{(k)}$ converge uniformly on every
compact $K \subset U$ since this mode of convergence is a property of local nature
(Chap. V, n° 6, Corollary 2 of Borel-Lebesgue).

We may now return to the general arguments of Chap. III, n° 22: since the
successive partial derivatives of the $f_n$ are, up to powers of $i$, identical to the
derivatives $f_n^{(k)}$ in the complex sense, the limit function is $C^\infty$ and its partial
derivatives, being the limits of those of the $f_n$, satisfy the Cauchy condition
$D_2 f = iD_1 f$ like them. The limit function $f$ is therefore holomorphic in $U$
and we have $f^{(k)}(z) = \lim f_n^{(k)}(z)$ for every $k$. Whence finally the famous


Theorem 17 (Weierstrass). Let $(f_n)$ be a sequence of holomorphic functions
in an open subset $U$ of $\mathbb{C}$. Assume that (i) $\lim f_n(z) = f(z)$ exists
for every $z \in U$, (ii) the convergence is uniform on a neighbourhood of every
point of $U$. Then the limit function is holomorphic in $U$ and, for every $k \in \mathbb{N}$,
the sequence of derivatives $f_n^{(k)}(z)$ converges to $f^{(k)}(z)$ uniformly on every
compact $K \subset U$.


Very clearly, one might state Theorem 17 in terms of series of analytic
functions$^{39}$. If in particular such a series converges normally on a neighbourhood
of every point of $U$, then its sum is holomorphic, can be differentiated
term-by-term, etc.


Example 1. This is the case of the Riemann function $\zeta(s) = \sum 1/n^s$ in the
open set $\mathrm{Re}(s) > 1$ where the series converges. For every $\sigma > 1$, the series
converges normally in the half plane $\mathrm{Re}(s) \ge \sigma$ since there $|1/n^s| \le 1/n^\sigma$.
The function is thus holomorphic in $\mathrm{Re}(s) > \sigma$ for every $\sigma > 1$, so in fact


39 Weierstrass’ original proof (1841) in fact concerns series of power series and does
not use the Cauchy integral formula of n° 1; it rests on a direct and elementary 
proof of the inequality (19.1), which allows him to apply his theorem on double 
series to a convergent series of analytic functions. The present proof is due to 
Paul Painlevé (1887); see Remmert, Funktionentheorie 1, Chap. 8, §§ 3 and 4, 
who sets out Weierstrass’ proof of (19.1) on one page, as simple as it is ingenious. 


in $\mathrm{Re}(s) > 1$, and its derivative is given by $\zeta'(s) = -\sum \log n/n^s$. Recall
(Chap. VI, n° 19) that in fact the $\zeta$ function can be extended analytically to
$\mathbb{C} - \{1\}$, the point $s = 1$ being a simple pole.


Example 2. The series

$\pi \cot \pi z = 1/z + \sum_{n \ge 1} 2z/(z^2 - n^2)$


converges normally on every compact set $K \subset \mathbb{C} - \mathbb{Z}$ as we have known a
long while, so is holomorphic in $\mathbb{C} - \mathbb{Z}$; and the only reason that it does not
converge normally on a neighbourhood of a point $n \in \mathbb{Z}$ lies in the term
$2z/(z^2 - n^2) = 1/(z - n) + 1/(z + n)$: the series obtained by suppressing
the term $1/(z - n)$ converges unproblematically on a neighbourhood of $n$. In
other words, on a neighbourhood of each $n \in \mathbb{Z}$, the function is the sum of
the pole term $1/(z - n)$ and of a function holomorphic on a neighbourhood
of $n$. It therefore has just a simple pole at $n$; it is a meromorphic function on
all $\mathbb{C}$, and the series can be differentiated term by term.
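
The term-by-term differentiation asserted in Example 2 can be observed numerically. The sketch below compares partial sums of $1/z + \sum 2z/(z^2 - n^2)$ with $\pi \cot \pi z$, and partial sums of the differentiated series with $-\pi^2/\sin^2 \pi z$, the derivative of $\pi \cot \pi z$. The function names and the truncation at 50000 terms are assumptions of this example; convergence is slow, with an error of order $1/N$ after $N$ terms.

```python
import cmath

def cot_series(z, terms=50000):
    """Partial sum of 1/z + sum over n >= 1 of 2z/(z^2 - n^2)."""
    total = 1 / z
    for n in range(1, terms + 1):
        total += 2 * z / (z * z - n * n)
    return total

def cot_series_derivative(z, terms=50000):
    """Partial sum of the series differentiated term by term."""
    total = -1 / (z * z)
    for n in range(1, terms + 1):
        total -= 1 / (z - n) ** 2 + 1 / (z + n) ** 2
    return total

def pi_cot(z):
    return cmath.pi * cmath.cos(cmath.pi * z) / cmath.sin(cmath.pi * z)

def pi_cot_derivative(z):
    return -(cmath.pi / cmath.sin(cmath.pi * z)) ** 2

if __name__ == "__main__":
    z = 0.3 + 0.4j
    print(cot_series(z), pi_cot(z))
    print(cot_series_derivative(z), pi_cot_derivative(z))
```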


Example 3. Consider similarly the elliptic functions à la Weierstrass of
Chap. II, n° 23, the sums of the series $\sum (z - \omega)^{-k}$ extended over the periods
($k \ge 3$), or, for $k = 2$, the modified series $\wp(z)$. We have shown that these are
analytic by proving by a routine calculation that, in every disc $|z| < R$, they
are the sum of a finite number of terms, corresponding to the periods situated
in this disc, and of an explicitly calculated power series. The preceding
theorem yields the result without any calculation since after subtraction of
the exceptional terms in question, the series $\sum (z - \omega)^{-k}$ converges normally
in the disc $|z| < R$ as we have already seen. The functions obtained are thus
holomorphic in $\mathbb{C}$ with the periods removed. In the neighbourhood of a period $\omega$
these functions are the sum of a holomorphic function on a neighbourhood of
$\omega$ and of the term $1/(z - \omega)^k$ of the series, whence a pole of order $k$ at each
point of the lattice. The Weierstrass functions are therefore meromorphic in
$\mathbb{C}$ and one may differentiate them term-by-term, which confirms, but without
calculation, the fact that the series $\sum (z - \omega)^{-k}$ are, for $k \ge 3$, the successive
derivatives of $\wp(z)$, up to obvious constant factors.


Example 4. The sum of a Fourier series $\sum a_n e_n(z)$ which converges in a
strip $a < \mathrm{Im}(z) < b$, and thus normally in every smaller closed strip, is a
holomorphic function.


20 — Infinite products of holomorphic functions 
Theorem 17 has an analogue, also due to Weierstrass, for infinite products: 


Theorem 18. Let $(u_n(z))$ be a sequence of functions holomorphic in a domain
$G$ and suppose the series $\sum u_n(z)$ is normally convergent on every
compact $K \subset G$. Then the function

$p(z) = \prod (1 + u_n(z)) = \lim\, (1 + u_1(z)) \ldots (1 + u_n(z))$


is holomorphic in $G$, its zeros are those of the functions $1 + u_n(z)$, and


(20.1) $p'(z)/p(z) = \sum u_n'(z)/(1 + u_n(z))$


at every point where $p(z) \ne 0$.


First we remark that, for every compact $K \subset G$, the series $\sum \|u_n\|_K$
converges, by the definition of normal convergence (Chap. III, n° 8). Thus
$\|u_n\|_K < 1/2$ for $n$ large, so that the factors $1 + u_n(z)$ which might vanish
somewhere in $K$ are finite in number; such a factor can moreover possess
only a finite number of zeros in $K$, for otherwise Bolzano-Weierstrass would
allow us to construct a sequence of pairwise distinct zeros converging to a
point of $K$, thus of $G$, and the principle of isolated zeros (Chap. II, end
of n° 19 and n° 20) would show, since $G$ is connected, that $1 + u_n(z)$ is
identically zero, a case that one may reasonably exclude in the considerations
which follow. Apart from these factors whose product is holomorphic in $G$,
the function $p$ is an infinite product all of whose terms are $\ne 0$ for any
$z \in K$; since $\sum |u_n(z)| < +\infty$, this product is absolutely convergent and not
zero (Chap. IV, n° 17, Theorem 13 whose proof we in any case will have to
reproduce).

Let us put $p_n(z) = (1 + u_1(z)) \ldots (1 + u_n(z))$ and remain in $K$, forgetting
to allow for the factors, finite in number, which vanish in $K$. We have


(20.2) $\log|p_n(z)| = \log|1 + u_1(z)| + \ldots + \log|1 + u_n(z)| \le$
$\le |u_1(z)| + \ldots + |u_n(z)| \le \|u_1\|_K + \ldots + \|u_n\|_K$.


The series $\sum \|u_n\|_K$ being convergent there exists a finite constant $M(K)$
such that $|p_n(z)| \le M(K)$ for every $z \in K$. Since we have $p_{n+1} - p_n =
p_n u_{n+1}$, it follows that $|p_{n+1}(z) - p_n(z)| \le M(K)\, |u_{n+1}(z)|$ for every $z \in K$,
whence

$\|p_{n+1} - p_n\|_K \le M(K)\, \|u_{n+1}\|_K$.


The series $\sum [p_{n+1}(z) - p_n(z)]$ therefore converges normally in $K$ like the
series $\sum u_n(z)$. We deduce that the sequence of the $p_n(z)$ converges uniformly on
every compact $K \subset G$, and therefore $p(z) = \lim p_n(z)$ is holomorphic in $G$
like the $p_n$, by Theorem 17; the “forgotten” factors in the product do not
affect the conclusion since they are holomorphic and finite in number on a
neighbourhood of each point of $G$.

It remains to prove (1), a generalisation of the rule


$(fg)'/fg = f'/f + g'/g$


for “logarithmic differentiation” of a product. At a point where $p(z) \ne 0$, one
has from the latter $p_n'(z)/p_n(z) = \sum_{k \le n} u_k'(z)/(1 + u_k(z))$. But Theorem 17
assures us that $\lim p_n'(z) = p'(z)$, and since $p_n(z)$ tends to $p(z) \ne 0$, one
deduces that $p'(z)/p(z)$ is the limit of the partial sums of the series (1).

In fact, the latter converges normally on every compact $K \subset G$. It suffices,
as always, to show this on a neighbourhood of every $a \in G$. To do this,
consider a compact disc $D : |z - a| \le R$ contained in $G$ and a disc $D' : |z - a| \le
r < R$ contained in the interior of the first. The lemma of n° 19 provides an
upper estimate $\|u_n'\|_{D'} \le M \|u_n\|_D$ with a constant $M$ independent of $n$.
On the other hand $\|u_n\|_D < 1/2$ for $n$ large, as we have seen above, and
thus $|1 + u_n(z)| > 1/2$ on $D$. The uniform norm on $D'$ of the general term of the
series (1) is therefore majorised for $n$ large by $2M \|u_n\|_D$, whence the normal
convergence of (1) in $D'$, and so on every compact $K \subset G$, qed.


Example 1. Consider (Chap. IV, n° 20) Euler’s formula

$P(z) = \prod_{n \ge 1} (1 - z^n)^{-1} = \sum_{n \ge 0} p(n)\, z^n$

which appears in the theory of partitions. Theorem 18 shows that the product
$\prod (1 - z^n)$ is holomorphic in the disc $D : |z| < 1$ and never vanishes; the left
hand side is therefore also holomorphic in $D$, whence the existence of an
expansion in power series in $D$. Since we have $p(n) \ge 1$ (this is the least one
could say ...), the radius of convergence is equal to 1.

The function $P(z)$ is an example of a curious phenomenon: it is not possible
to “extend the function $P$ analytically” outside $D$: if an analytic function
defined in a domain $G \supset D$ coincides with $P$ in $D$, then $G = D$. One may
understand this by observing that $P(z)$ seems not to tend to any limit when
$z$ tends to any root of unity, since then infinitely many factors of the product
become infinite, but this is not a proof...


Example 2. Let $q$ be a complex constant such that $|q| < 1$ and let us consider
the infinite product


(20.3) $f(z) = \prod (1 + q^n z)$,


where the product is extended over all $n \ge 1$. Since $\sum |q^n| < +\infty$, Weierstrass’
theorem applies, the result being an entire function of $z$. It is clear
that $f(z) = (1 + qz) f(qz)$, so that the power series $f(z) = \sum a_n z^n$ which
represents $f$ in the whole plane satisfies


$\sum a_n z^n = (1 + qz) \sum a_n q^n z^n$;

one deduces that $a_n = q^n a_n + q^n a_{n-1}$, i.e. that

$(1 - q^n)\, a_n = q^n a_{n-1}$.


Since $a_0 = f(0) = 1$, it follows that


$a_n = q^{n(n+1)/2} / (1 - q) \ldots (1 - q^n)$,


whence the identity


(20.4) $\prod_{n=1}^{\infty} (1 + q^n z) = 1 + \sum_{n=1}^{\infty} \frac{q^{n(n+1)/2}}{(1 - q) \ldots (1 - q^n)}\, z^n$ $\quad (|q| < 1,\ z \in \mathbb{C})$.
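
The identity (20.4), read as $\prod_{n \ge 1} (1 + q^n z) = \sum_{n \ge 0} q^{n(n+1)/2} z^n / (1 - q) \ldots (1 - q^n)$, lends itself to a quick numerical check. The sketch below truncates both sides; the function names and the truncation orders are assumptions chosen for this example, justified by the very fast decay of $q^n$ and $q^{n(n+1)/2}$.

```python
def q_product(q, z, factors=200):
    """Truncated product of (1 + q^n z) over n = 1, ..., factors."""
    result = 1 + 0j
    for n in range(1, factors + 1):
        result *= 1 + q ** n * z
    return result

def q_series(q, z, terms=40):
    """Truncated sum of q^(n(n+1)/2) z^n / ((1 - q)...(1 - q^n)) over n = 0, ..., terms."""
    total = 0j
    denominator = 1 + 0j  # empty product for n = 0
    for n in range(terms + 1):
        if n > 0:
            denominator *= 1 - q ** n
        total += q ** (n * (n + 1) // 2) * z ** n / denominator
    return total

if __name__ == "__main__":
    q = 0.3 + 0.4j   # |q| = 0.5
    z = 1.5 - 0.7j   # any complex z is allowed
    print(q_product(q, z), q_series(q, z))
```

Both sides are entire functions of $z$, so the check is not restricted to $|z| < 1$.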


Exercise: show that 


Co Co e 


(20.5) [[ a+r" mag A a 2p) 


n=0 = 


for $|q| < 1$, $|z| < 1$. What happens for $|z| > 1$?


Example 3. Consider the infinite product


(20.6) $f(z) = z \prod (1 - z^2/n^2)$


extended over $n \ge 1$. It satisfies the hypotheses of the theorem, with $G = \mathbb{C}$,
so represents an entire function having simple zeros at the $n \in \mathbb{Z}$ and is
nonzero elsewhere. Theorem 18 shows that


(20.7) $f'(z)/f(z) = 1/z + \sum 2z/(z^2 - n^2) = \pi \cot \pi z = (\sin \pi z)'/\sin \pi z$.


In the connected open set $\mathbb{C} - \mathbb{Z}$ where it is defined, the holomorphic function
$f(z)/\sin \pi z$ thus has derivative zero, so is constant; since $f(z)/z$ and $\sin \pi z/z$
tend respectively to 1 and $\pi$ when $z$ tends to 0, this constant is equal to $1/\pi$.
Whence


(20.8) $\sin \pi z = \pi z \prod (1 - z^2/n^2)$.


This proof of Euler’s infinite product manifests the fantastical character
of his considerations on “algebraic equations of infinite degree” (Chap. II, end
of n° 21). As we have said, the infinite product (6) is an entire function whose
only zeros are the $n \in \mathbb{Z}$; on a neighbourhood of such a point, $f(z)$ is the
product of $1 - z/n$ by an infinite product which no longer vanishes at $n$, so that
the $n \in \mathbb{Z}$ are simple zeros of $f$. Since it is clear that $z = n$ is likewise a simple
zero of the function $\sin \pi z$ (obvious for $n = 0$, so for any $n$ by periodicity),
one deduces that $\sin \pi z = g(z) f(z)$ where $g$ is an everywhere holomorphic
function (so analytic) having no zero in $\mathbb{C}$. For such a function, the quotient
$g'(z)/g(z)$ is again an entire function, so an everywhere convergent power
series, so has in $\mathbb{C}$ a primitive $h(z)$ such that $h'(z) = g'(z)/g(z)$ as we know
(Chap. II, n° 19). It follows that


$(ge^{-h})' = g'e^{-h} - gh'e^{-h} = 0$,
so that $g(z) = ce^{h(z)}$ where $c$ is a constant that one may assume equal to 1 by
incorporating a suitable constant into $h$. Euler’s argument, corrected, thus
proves the existence of a relation of the form


(20.9) $\sin \pi z = e^{h(z)} z \prod (1 - z^2/n^2)$


with an entire function $h$ about which the preceding argument provides no
information whatsoever ...


In fact, Weierstrass invented, and his successors refined, a whole theory
that allows one to represent any entire function $f$ by an infinite product that
exhibits its zeros, but this is much less simple than Euler’s ideas. The first
idea is to order the zeros $a_n$ of $f$ in a sequence$^{40}$ such that $|a_n| \le |a_{n+1}|$,
then to consider the infinite product $\prod (1 - z/a_n)$. This then has exactly
the same zeros as $f$, with the same orders of multiplicity, whence $f(z) =
g(z) \prod (1 - z/a_n)$ where $g$ is an entire function without zeros, so of the form
$e^{h(z)}$. This is Euler’s marvellous argument (except that he forgot the factor
$g$).

But first one should verify the convergence of the infinite product! Though
obvious in the case of the sine function when one groups the symmetric
factors, it can be perfectly false in the case of an arbitrary entire function$^{41}$.

Weierstrass’ idea is now to multiply each factor $1 - z/a_n$ by as simple a
function as possible, vanishing nowhere so as not to add parasitic zeros to the
product, and making the infinite product convergent. This technique is very
shrewd. First, it is clear that, for $z$ given, $z/a_n$ tends to 0, so is in modulus
$< 1$ for $n$ large. Consider, generally, $1 - z$ for $|z| < 1$. Then


$1 - z = \exp(-z - z^2/2 - z^3/3 - \ldots)$


[Chap. IV, (13.12)], whence, for every $p$,

$1 - z = \exp(z + \ldots + z^p/p)^{-1} \exp[-z^{p+1}/(p + 1) - \ldots]$.

Consider the functions

(20.10) $E_p(z) = (1 - z) \exp(z + \ldots + z^p/p) = \exp[-z^{p+1}/(p + 1) - \ldots]$.


The factor $\exp(z + \ldots + z^p/p)$ never vanishes and tends to 1 when $z$ tends
to 0, as does $E_p(z)$. Let us prove that


40 The set of zeros is countable, for, by the principle of isolated zeros, there can be
only finitely many in the disc $|z| \le p$ for any $p \in \mathbb{N}$. In what follows, we assume
that each zero appears as many times among the $a_n$ as its order of multiplicity.

41 Consider the function $\sin(\pi z^2)$. Its zeros are the $z$ such that $z^2 \in \mathbb{Z}$, so the
points of the form $\pm |n|^{1/2}$ or $\pm i|n|^{1/2}$, and the infinite product is divergent like the
series $\sum 1/|n|^{1/2}$.

(20.11) $|1 - E_p(z)| \le |z|^{p+1}$ for $|z| \le 1$.


The function $1 - E_p(z)$ vanishes at the origin and is holomorphic in all $\mathbb{C}$; its
derivative


(20.12) $-E_p'(z) = z^p \exp(z + \ldots + z^p/p) = z^p \sum_{n \ge 0} (z + \ldots + z^p/p)^n / n!$


(exercise!) is an everywhere convergent power series with all its coefficients
positive (Theorem 17) and whose term of lowest degree is $z^p$; that of $1 -
E_p(z)$ therefore starts with a term in $z^{p+1}$. Schwarz’ lemma now shows that
$|1 - E_p(z)| \le M|z|^{p+1}$ for $|z| \le 1$, where $M$ is the maximum of $|1 - E_p(z)|$ on the circle
$|z| = 1$; but since the coefficients of the power series of $1 - E_p(z)$ are, like
those of its derivative, all positive, its maximum on $|z| = 1$ is attained for
$z = 1$ and so is equal to 1 by (10); whence (11), thanks to Remmert.
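
The inequality (20.11), $|1 - E_p(z)| \le |z|^{p+1}$ for $|z| \le 1$, with $E_p(z) = (1 - z)\exp(z + \ldots + z^p/p)$, can be probed on a grid of points of the closed unit disc. The sketch below is only a sampling experiment, not a proof; the function names and the grid sizes are assumptions of this example.

```python
import cmath
import math

def elementary_factor(p, z):
    """E_p(z) = (1 - z) exp(z + z^2/2 + ... + z^p/p); E_0(z) = 1 - z."""
    partial_log = sum(z ** k / k for k in range(1, p + 1))
    return (1 - z) * cmath.exp(partial_log)

def worst_ratio(p, radii=20, angles=72):
    """Largest sampled value of |1 - E_p(z)| / |z|^(p+1) for 0 < |z| <= 1."""
    worst = 0.0
    for i in range(1, radii + 1):
        rho = i / radii
        for m in range(angles):
            z = rho * cmath.exp(2j * math.pi * m / angles)
            ratio = abs(1 - elementary_factor(p, z)) / rho ** (p + 1)
            worst = max(worst, ratio)
    return worst

if __name__ == "__main__":
    for p in range(6):
        print(p, worst_ratio(p))
```

The sampled ratio never exceeds 1 and reaches 1 at $z = 1$, where $E_p(1) = 0$, in agreement with the argument above.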

This point completed, let us return to the entire function $f(z)$ and to its
zeros $a_n$. The function $E_p(z/a_n) = (1 - z/a_n) \exp(\ldots)$ has a simple zero at
$a_n$ and is $\ne 0$ elsewhere. We may therefore try to compare $f(z)$ with the
infinite product


(20.13) $h(z) = \prod E_{p_n}(z/a_n)$


where the $p_n$ are chosen to make the product absolutely convergent. Since it
can be written as $\prod [1 + u_n(z)]$ with


$|u_n(z)| \le |z/a_n|^{p_n + 1}$


by (11), and since, for every compact $K \subset \mathbb{C}$, we have $|z/a_n| < 1/2$ for every
$z \in K$ if $n$ is large enough (the $|a_n|$ increase indefinitely since the zeros of
an entire function are isolated in $\mathbb{C}$), we may always choose the $p_n$ to make
the series $\sum u_n(z)$ normally convergent on every compact; at the worst, we
choose $p_n = n - 1$ for every $n$.

This done, we have, as in the case of the function $\sin \pi z$, a relation


(20.14) $f(z) = e^{g(z)} \prod E_{p_n}(z/a_n)$


with an entire function $g(z)$ about which, a priori, we know nothing.

It goes without saying that the choice $p_n = n - 1$ is not always the best
possible, as shown by the case of the sine function, and that, moreover,
it would be useful to determine the function $g(z)$ more precisely. There are
theorems that apply to functions that do not increase too fast at infinity. To
enter into this difficult subject which has interested (too) many specialists
would no doubt exceed the capacity of most of our readers, and, even more
certainly, of the author.


Example 4 (infinite product for the Gamma function). Consider Euler’s function

(20.15) $\Gamma(s) = \int_0^{+\infty} e^{-x} x^{s-1}\, dx$.


We already know some of its important properties:


(i) the integral converges absolutely if and only if $\mathrm{Re}(s) > 0$ (Chap. V,
n° 22, Example 1), satisfies $\Gamma(s + 1) = s\Gamma(s)$, and also


(20.16) $s\Gamma(s) = \lim n^s/(1 + s)(1 + s/2) \ldots (1 + s/n)$;


(ii) the function $\Gamma$ is holomorphic for $\mathrm{Re}(s) > 0$ and can be continued
holomorphically to $G = \mathbb{C} - \{0, -1, -2, \ldots\}$ (Chap. V, n° 25, Example 5);
in Chap. V, we did not yet know that “holomorphic” and “analytic” are
synonymous, but we know this now; this shows in passing that the various
methods we have used to continue the function $\Gamma$ analytically to $G$ yield the
same function;


(iii) one may (Chap. VI, n° 18) transform (16) into an expansion as an
infinite product


(20.17) $1/s\Gamma(s) = e^{\gamma s} \prod (1 + s/n)\, e^{-s/n}$


($\gamma$ denoting Euler’s constant) which converges absolutely for every $s \in \mathbb{C}$.

If one only knows that (17) is valid for $\mathrm{Re}(s) > 0$, it is easy to lift this
restriction; it amounts to proving that Theorem 18 applies in $\mathbb{C}$: the principle
of analytic continuation will do the rest. Remmert, Funktionentheorie 2, p. 31,
gives what is surely the simplest proof. One starts from the identity


$1 - (1 - w)e^w = w^2 [(1 - 1/2!) + (1/2! - 1/3!)w + (1/3! - 1/4!)w^2 + \ldots]$


and remarks that the coefficients of the $w^n$ are all $> 0$; for $|w| \le 1$ one thus
has


$|1 - (1 - w)e^w| \le |w|^2 \sum_{p \ge 1} [1/p! - 1/(p + 1)!] = |w|^2$.


But if one puts the general term of the product (17) in the form $1 - u_n(s)$,
one has $u_n(s) = 1 - (1 - w)e^w$ for $w = -s/n$; consequently


$|u_n(s)| \le |s/n|^2$ for $n \ge |s|$.


In a disc $|s| \le R$, one thus has $|u_n(s)| \le R^2/n^2$ for $n \ge R$, whence the normal
convergence of $\sum u_n(s)$ in $|s| \le R$, qed.
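
The product expansion (20.17), read as $1/s\Gamma(s) = e^{\gamma s} \prod (1 + s/n)\, e^{-s/n}$ with $\gamma$ Euler's constant, can be compared with the Gamma function of the Python standard library. The sketch below is restricted to real $s$ because `math.gamma` only accepts real arguments; the function name, the numerical value used for $\gamma$ and the truncation order are assumptions of this example. The truncation error after $N$ factors is of relative order $s^2/2N$, in line with the bound $|u_n(s)| \le |s/n|^2$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # numerical value of Euler's constant

def weierstrass_product(s, factors=200000):
    """Truncation of e^(gamma s) * prod (1 + s/n) e^(-s/n) over n = 1, ..., factors."""
    product = math.exp(EULER_GAMMA * s)
    for n in range(1, factors + 1):
        product *= (1 + s / n) * math.exp(-s / n)
    return product

if __name__ == "__main__":
    for s in (0.5, 1.5, 3.2, -0.5):
        print(s, weierstrass_product(s), 1 / (s * math.gamma(s)))
```

The value $s = -0.5$ lies outside the half plane $\mathrm{Re}(s) > 0$, which illustrates that the product represents the analytic continuation.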


Example 5. Let us go back to the theory of elliptic functions with a lattice $L$
of periods (Chap. II, n° 23) and let us consider, with Weierstrass, the infinite
product $\prod (1 - z/\omega)$ extended over the periods $\omega \in L$. It clearly does not
converge since the convergence of $\sum 1/|\omega|^k$ presupposes that $k \ge 3$ for $k$ an
integer (or $k > 2$ for $k$ real). But we have

$1 - z/\omega = \exp(-z/\omega - z^2/2\omega^2 - \ldots) =$
$= \exp(-z/\omega - z^2/2\omega^2) \exp(-z^3/3\omega^3 - \ldots)$


with $|1 - \exp(-z^3/3\omega^3 - \ldots)| \le M\, |z^3/3\omega^3|$ for $|z/\omega| \le 1$ by (11); for $|z| \le
R$, this condition holds for $|\omega| \ge R$, whence a bound by $MR^3/3|\omega|^3$ which
ensures the normal convergence in the disc considered. We conclude that the
infinite product (no connection with the Riemann function $\zeta$)


(20.18) $\zeta_L(z) = z \prod_{\omega \ne 0} (1 - z/\omega)\, e^{z/\omega + z^2/2\omega^2}$

converges normally on every compact subset of $\mathbb{C} - L$ and even on a neighbourhood
of every $\omega \in L$ so long as we isolate the term $1 - z/\omega$. In this way
one finds an entire function having simple zeros at the $\omega \in L$ and nonzero
elsewhere.
Applying the differentiation formula, one obtains a new bizarre function


(20.19) $\sigma_L(z) = \zeta_L'(z)/\zeta_L(z) = 1/z + \sum [1/(z - \omega) + 1/\omega + z/\omega^2]$


and, differentiating once again,


(20.20) $-\sigma_L'(z) = 1/z^2 + \sum [1/(z - \omega)^2 - 1/\omega^2] = \wp_L(z)$,


the gothic cursive function $\wp$ of this same Weierstrass associated with the
lattice $L$. The beauty of these calculations is that they are apparently purely
formal; but, in reality, everything converges because Theorem 17, clearly
applicable to unconditional convergence, justifies everything once one knows
that the infinite product (18) converges.

The relation $\sigma_L'(z) = -\wp_L(z)$ shows that the derivative of the function
$\sigma_L$ does not change if we add a period to $z$; hence a relation of the form


$\sigma_L(z + \omega) = \sigma_L(z) + c(\omega)$

with a constant $c(\omega)$ clearly satisfying

$c(\omega' + \omega'') = c(\omega') + c(\omega'')$,


whence $c(n_1\omega_1 + n_2\omega_2) = n_1 c(\omega_1) + n_2 c(\omega_2)$, which allows one to calculate it
if one knows $c(\omega)$ for two periods forming a basis$^{42}$ of $L$. We could continue
(the essentials of the theory of elliptic functions can be expounded with
hardly any other tools than those of the present §), but it will be better to
delay these explorations for later (Chap. XII).


42 i.e. two periods $\omega_1$ and $\omega_2$ such that every other is a linear combination of them
with integer coefficients.


§ 5. Harmonic functions and Fourier series


21 — Analytic functions defined by a Cauchy integral 


The calculation which, in n° 1, allowed us to represent a power series converging
on a disc $|z| < R$ by an integral over a circle with centre 0 and
radius $r < R$, can be inverted and generalised: every reasonable function $f$
defined on the circle $|z| = r$ allows one, thanks to Cauchy’s integral formula,
to define a function $P_f$, its Poisson transform (Siméon Denis, 1781-1840, a
less brilliant competitor of Fourier and Cauchy, to whom, nevertheless, we
owe several important ideas), defined and analytic for $|z| \ne r$. The study
of this function, beyond its being an excellent exercise, allows us to prove
Weierstrass’ approximation theorem again, and, more importantly, to establish
the principal properties of the “harmonic” functions, which are, at least
locally, the real parts of holomorphic functions, and conversely.

To simplify the formulae, we shall assume that $r = 1$ in what follows. We
need only replace $f(u)$ by $f(ru)$ to obtain the general case.

For a regulated periodic function $f$ the function $P_f$ is given by


(21.1) $P_f(z) = \frac{1}{2\pi i} \int \frac{f(\zeta)}{\zeta - z}\, d\zeta = \int \frac{u}{u - z}\, f(u)\, dm(u)$;


we have introduced a factor $2\pi i$ so as to recover as $P_f$ the function $f(z)$ if
one chooses for $f(u)$ the restriction to $\mathbb{T}$ of a power series, as in the case
of Cauchy’s formula (1.4). Recall how, in (1), one passes from the complex
Leibniz notation to the integral in $u$: one puts $\zeta = u = e(t)$, whence $d\zeta =
2\pi i\, e(t)\, dt = 2\pi i\, u\, dm(u)$.

One may generalise further and replace the expression $f(u)\,dm(u)$ by a
measure $\mu$ on $\mathbb{T}$, whence the Poisson transform


(21.2) $P_\mu(z) = \int \frac{u}{u - z}\, d\mu(u)$

of $\mu$. If for example $\mu$ is the Dirac measure at the point $u = 1$, one obtains
$P_\mu(z) = 1/(1 - z)$. One might use the same formula to define that of a
distribution on $\mathbb{T}$ since the function $u \mapsto u/(u - z)$ is indefinitely differentiable
on the circle. We shall see that all these functions are analytic for $|z| \ne 1$.
For $|z| < 1$, we have, as in n° 1,


$u/(u - z) = 1/(1 - u^{-1}z) = \sum_{n \ge 0} u^{-n} z^n$


with a series of functions of $u$ which converges normally, so uniformly, on
the interval of integration. We may therefore integrate term-by-term by the
definition of a measure, i.e. thanks to the estimate


(21.3) $\left| \int f(u)\, d\mu(u) \right| \le M(\mu)\, \|f\|$
valid for every function $f$ defined and continuous on the circle. This done we
clearly find


(21.4) $P_\mu(z) = \sum_{n \ge 0} a_n z^n$ where $a_n = \int u^{-n}\, d\mu(u) = \hat\mu(n)$ $\quad (|z| < 1)$


by the definition (3.1”’) of the Fourier coefficients of a measure or distribution
on $\mathbb{T}$. The series we obtain converges absolutely for $|z| < 1$: since $|u| = 1$ we
have $|a_n| \le M(\mu)$ by (3), and the result follows since $|z| < 1$. In other words,
$P_\mu$ is analytic in the disc $|z| < 1$.

The reader would be wrong to be overwhelmed by these vast generalisa- 
tions: we are dealing in trivialities, i.e. assertions following directly from the 
definitions, not to be confused with theorems which require more or less long 
and difficult proofs. 

Exercise: show that, for $z$ given with $|z| < 1$, the series $\sum z^n/u^n$ converges
in the space $\mathcal{D}(\mathbb{T})$ or, equivalently, that the series $\sum z^n e_n(t)$ (sum over the
$n \ge 0$), and all those obtained by differentiating term-by-term ad libitum
with respect to $t$, converge uniformly on $\mathbb{R}$. Deduce that (4) applies to every
distribution on $\mathbb{T}$.

For $|z| > 1$, we must, on the contrary, expand in powers of $u/z$, i.e. use
the formula


$u/(u - z) = -(u/z)/(1 - uz^{-1}) = -\sum_{n > 0} u^n z^{-n}$,


whence


(21.5) $P_\mu(z) = \sum_{n > 0} b_n/z^n$ where $b_n = -\int u^n\, d\mu(u) = -\hat\mu(-n)$ $\quad (|z| > 1)$.
n>0 


In this way we find an analytic function of $1/z$, so also of $z$, on the open
set $|z| > 1$. It generally has no connection with the function $P_\mu$ obtained
for $|z| < 1$; if for example one starts, as in n° 1, from the measure $d\mu(u) =
f(u)\,dm(u)$, where $f$ is a power series converging absolutely on $\mathbb{T}$, then the
function (4) is identical with $f$ but (5) is identically zero by the formulae
(1.4). No matter, finally we have


(21.6) $P_\mu(z) = \begin{cases} \sum_{n \ge 0} \hat\mu(n)\, z^n & \text{for } |z| < 1, \\ -\sum_{n < 0} \hat\mu(n)\, z^n & \text{for } |z| > 1. \end{cases}$

In the most important case, that of formula (1), the formulae (6) can be
written


(21.7) $P_f(z) = \int \frac{u}{u - z}\, f(u)\, dm(u) = \begin{cases} \sum_{n \ge 0} \hat f(n)\, z^n & \text{for } |z| < 1, \\ -\sum_{n < 0} \hat f(n)\, z^n & \text{for } |z| > 1, \end{cases}$


since the Fourier coefficients of the measure $d\mu(u) = f(u)\,dm(u)$ associated
with the function $f$ are those of $f$.
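
The two regimes of (21.7) can be seen in a small experiment: for a function with only nonnegative frequencies the transform reproduces the function inside the disc and vanishes outside, while for $f(u) = 1/u$, whose only nonzero Fourier coefficient is $\hat f(-1) = 1$, it vanishes inside and equals $-1/z$ outside. The sketch below approximates $\int u/(u - z)\, f(u)\, dm(u)$ by a uniform Riemann sum; the function name and the sample count are assumptions of this example.

```python
import cmath
import math

def cauchy_transform(f, z, samples=512):
    """Riemann-sum approximation of the integral of u/(u - z) f(u) dm(u), for |z| != 1."""
    total = 0j
    for m in range(samples):
        u = cmath.exp(2j * math.pi * m / samples)  # uniformly spaced points of the circle
        total += u / (u - z) * f(u)
    return total / samples

def polynomial(u):
    """A test function with Fourier coefficients 1, 2, 3 at n = 0, 1, 2."""
    return 1 + 2 * u + 3 * u * u

if __name__ == "__main__":
    inside, outside = 0.4 - 0.3j, 1.5 + 2j
    print(cauchy_transform(polynomial, inside), polynomial(inside))
    print(cauchy_transform(polynomial, outside))
    print(cauchy_transform(lambda u: 1 / u, outside), -1 / outside)
```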
22 — Poisson’s function


Consider a continuous function $f(u)$ on the unit circle $\mathbb{T} : |u| = 1$, and as
always put $f(t) = f(e^{2\pi i t}) = f(e(t))$; let us examine the function


(22.1) $P_f(z) = \int_0^1 \frac{e(t)}{e(t) - z}\, f(t)\, dt = \int \frac{u}{u - z}\, f(u)\, dm(u)$.

As we have seen above, this formula represents two different analytic functions
in the open sets $|z| < 1$ and $|z| > 1$. It is by comparing their behaviour
on a neighbourhood of a point $u = e(t)$ of the limit circle $\mathbb{T}$ that we shall
obtain results on the Fourier series of $f$.

To do this we put $z = ru$ with $r \ne 1$, and let $r$ tend to 1 either through
values $< 1$, or through values $> 1$.

If $r < 1$, we have, by (21.7),


(22.2) $P_f(ru) = \sum_{n \ge 0} \hat f(n)\, r^n u^n$.


As $r$ tends to 1, we then “clearly” find


(22.3) $\lim_{r \to 1,\ r < 1} P_f(ru) = \sum_{n \ge 0} \hat f(n)\, u^n$.

This passage to the limit is, alas, not always justified; since


$\sup_{r \le 1} |\hat f(n)\, r^n u^n| = |\hat f(n)|$


because all the exponents $n$ featuring are positive, the series (2), considered
for $u$ fixed, as a series of continuous functions of $r$ in the interval $[0, 1]$, will
be normally convergent if and only if one assumes that


(22.4) $\sum_{n \ge 0} |\hat f(n)| < +\infty$.


Passage to the limit term-by-term is then allowed by Theorem 9 of Chap. III,
n° 8: the sum of the series is a continuous function of $r$ on the closed interval
$[0, 1]$, so that its value $\sum \hat f(n)\, u^n$ for $r = 1$ is the limit of its values when
$r < 1$ tends to 1.


For $z = r'u$, with $r' > 1$, we have to start from the formula


(22.5) $P_f(r'u) = -\sum_{n < 0} \hat f(n)\, r'^n u^n$.


If

(22.6) $\sum_{n < 0} |\hat f(n)| < +\infty$,

the preceding argument applies again since the fact that $r'$ is $> 1$ is compensated
by the presence in (5) of all negative exponents: the series (5) is
dominated, on the closed interval $r' \ge 1$, by the convergent series (6). Hence
we find


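(22.7) $\lim_{r' \to 1,\ r' > 1} P_f(r'u) = -\sum_{n < 0} \hat f(n)\, u^n$.

If the hypotheses (4) and (6) hold, i.e. if

(22.8) $\sum_{\mathbb{Z}} |\hat f(n)| < +\infty$,

we then see that the Fourier series of $f$ is given by

(22.9) $\sum_{\mathbb{Z}} \hat f(n)\, u^n = \lim_{r \to 1,\ r < 1} P_f(ru) - \lim_{r' \to 1,\ r' > 1} P_f(r'u)$.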


Since we hope that the left hand side has value $f(u) = f(t)$, we have to
examine the second more closely. We shall see that, if we choose to let $r$ and
$r'$ vary so that $r' = 1/r$, the difference $P_f(ru) - P_f(r'u)$ is then expressed as
a very simple integral which tends to $f(u)$ if $f$ is continuous; if the hypothesis
(8) holds, we will thus have shown, without using the results of the § 2,
that $f$ is the sum of its Fourier series.

Since $r' = 1/r$, we have $r'^n = r^{-n} = r^{|n|}$ for $n < 0$. By (2) and (5), then


(22.10) $P_f(ru) - P_f(u/r) = \sum \hat f(n)\, r^{|n|} u^n =$
$= \sum r^{|n|} u^n \int v^{-n} f(v)\, dm(v) =$
$= \sum \int r^{|n|} u^n v^{-n} f(v)\, dm(v)$

by the definition of the Fourier coefficients of $f$; we have written $v$ for the
variable of integration to distinguish it from the free variable $u$. Since the
function $f$ is regulated (it is unnecessary to assume it continuous for the
moment) and since the series $\sum r^{|n|} u^n v^{-n} = \sum r^{|n|} (uv^{-1})^n$ is, for $r < 1$
and $u$ given, normally convergent on the circle $|v| = 1$, we may interchange
the signs $\int$ and $\sum$ in (10); putting


(22.11) $H_f(z) = \sum \hat f(n)\, r^{|n|} u^n$,
(22.11') $P(z) = \sum r^{|n|} u^n$ for $z = ru$


(the series are extended over $\mathbb{Z}$), we thus have

(22.12) $H_f(z) = \int P(zv^{-1})\, f(v)\, dm(v)$.


On putting $u = e(s)$ and $v = e(t)$, whence $uv^{-1} = e(s - t)$, we again obtain


(22.13) $H_f(ru) = \int P[re(s - t)]\, f(t)\, dt$.


These changes of notation are a translation exercise, passing from the point
of view of “periodic functions on $\mathbb{R}$” to the point of view of “functions on
$\mathbb{T}$”.

The formula (12) is a convolution product on $\mathbb{T}$, analogous to the one that
allowed us to obtain convergence theorems for Fourier series, by Dirichlet’s
method in n° 11, or by that of Fejér in n° 12. Likewise here: the functions
$v \mapsto P(zv^{-1})$ allow us to approximate $f$ with the help of (12) or (13) when $r$
tends to 1.

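First, let us calculate the function $P$. For $z = ru$, $r < 1$, we have


$P(z) = \sum_{\mathbb{Z}} r^{|n|} u^n = \sum_{n \ge 0} r^n u^n + \sum_{n > 0} r^n u^{-n} =$
$= \frac{1}{1 - ru} + \frac{1}{1 - r\bar u} - 1 = \frac{1}{1 - z} + \frac{1}{1 - \bar z} - 1 = \mathrm{Re}\left(\frac{1 + z}{1 - z}\right)$


or again


(22.14) $P(z) = \frac{1 - |z|^2}{|1 - z|^2} = \frac{1 - r^2}{1 - 2r\cos 2\pi s + r^2}$ for $z = re(s)$.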
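
The closed form of the kernel, $P(re(s)) = \sum_{\mathbb{Z}} r^{|n|} e(ns) = (1 - r^2)/(1 - 2r\cos 2\pi s + r^2)$, can be compared with a truncated version of the defining series. The function names and the truncation order in the sketch below are assumptions of this example.

```python
import cmath
import math

def kernel_series(r, s, terms=400):
    """Truncated sum of r^|n| u^n over |n| <= terms, with u = e(s) = exp(2 pi i s)."""
    u = cmath.exp(2j * math.pi * s)
    total = 1 + 0j  # term n = 0
    for n in range(1, terms + 1):
        total += r ** n * (u ** n + u ** (-n))
    return total

def kernel_closed_form(r, s):
    """(1 - r^2) / (1 - 2 r cos(2 pi s) + r^2), valid for 0 <= r < 1."""
    return (1 - r * r) / (1 - 2 * r * math.cos(2 * math.pi * s) + r * r)

if __name__ == "__main__":
    for s in (0.0, 0.1, 0.37):
        print(s, kernel_series(0.8, s), kernel_closed_form(0.8, s))
```

The series is real and positive, and its value at $s = 0$ is $(1 + r)/(1 - r)$, which grows without bound as $r$ tends to 1.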


This formula demonstrates the dubious behaviour of $P(z)$ when $z$ tends to 1,
and so that of $P(zv^{-1})$ when $z$ tends to $v$. It shows moreover that, for every
real function $f$ on $\mathbb{T}$, the function


(22.15) $H_f(z) = \int P(zv^{-1})\, f(v)\, dm(v) = \int \mathrm{Re}\left(\frac{v + z}{v - z}\right) f(v)\, dm(v)$


is the real part of a holomorphic function for $|z| < 1$, a trivial result.


23 — Applications to Fourier series 


Assuming $f$ regulated let us return to the formula

(23.1) $H_f(z) = \int P(zv^{-1})\, f(v)\, dm(v)$.


We shall show that if $f$ is continuous at a point $u \in \mathbb{T}$, then $H_f(z)$ tends to
$f(u)$ when $z$ tends to $u$ remaining in the disc $D : |z| < 1$.

It is more convenient to replace $z$ by $zu$, to make $z$ tend to 1, and to start
from the relation

(23.2) $H_f(zu) = \int P(zuv^{-1})\, f(v)\, dm(v) = \int P(zw^{-1})\, f(uw)\, dm(w)$.


To apply the method of Dirac sequences (n° 5) to the functions $w \mapsto P(zw^{-1})$
it is enough to show that they are positive, have total integral 1, and that,
when $z \to 1$, the function $P(zw^{-1})$ converges to 0 uniformly on every arc
$J : |w - 1| \ge \delta$ of $\mathbb{T}$.

The function $P$ is clearly positive, by (22.14). To establish the relation


(23.3) $\int P(zw^{-1})\, dm(w) = \int P(zw)\, dm(w) = 1$ for $|z| < 1$,


one observes that, for $|z| < 1$, the function $w \mapsto P(zw^{-1})$ is an absolutely
convergent Fourier series as (22.11’) shows; the integral (3) is then (Chap. V,
n° 5) equal to the term $n = 0$ of the series, obviously equal to 1.

It remains to verify uniform convergence on the arc $J$. Now


(23.4) $P(zw^{-1}) = (1 - |z|^2)/|1 - zw^{-1}|^2 = (1 - |z|^2)/|z - w|^2$.


Since the uniform norm of $w \mapsto P(zw^{-1})$ on $J$ is the product of $1 - |z|^2$,
which tends to 0 and does not depend on $w$, and of the uniform norm of $w \mapsto
1/|z - w|^2$, it is enough to show that the latter is, for $z$ near to 1, majorised by
a constant independent of $z$. But this is obvious since the relations $|w - 1| \ge \delta$
and $|z - 1| \le \delta/2$ imply $|z - w| \ge \delta/2$ and thus $1/|z - w|^2 \le 4/\delta^2$.

In sum, the conditions (D 1), (D 2) and (D 3) imposed on Dirac sequences 
in n° 5 do indeed hold. The fact that our functions depend on a complex 
parameter z which tends to 1, rather than on an integer n which increases 
indefinitely, clearly does not change the proofs. Since we may also make z 
tend to 1 on the real axis, we finally obtain the following statement: 


Theorem 19. Let f be a regulated function on T. Then 


(23.5)  $f(u) = \lim_{z \to 1,\ |z|<1} H_f(zu) = \lim_{r \to 1,\ r<1} \sum_{\mathbf{Z}} r^{|n|} \hat f(n)\, u^n$


at every point u ∈ T where f is continuous. If f is continuous on an open
arc J of T then the limit (5) is uniform on every compact K ⊂ J when z or
r tends to 1.


Translation into the language of periodic functions: 


$f(t) = \lim_{r \to 1,\ r<1} \sum r^{|n|} \hat f(n)\, e_n(t)$


at every point t ∈ R where f is continuous, and uniform convergence on R
if f is continuous everywhere. We emphasise again that generally one cannot
pass to the limit term-by-term in the series; if this were possible, as Poisson 
believed, every continuous function would be the sum of its Fourier series, 
which is not the case. 
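As an illustration of Theorem 19, here is a minimal sketch for the continuous triangle wave f(t) = |2t − 1| on [0, 1), whose Fourier coefficients are 1/2 for n = 0, $2/(\pi n)^2$ for odd n and 0 for the other n. The example function and the names `f`, `fourier_coeff` and `abel_sum` are assumptions made for the illustration.

```python
import math

def f(t: float) -> float:
    """Continuous triangle wave of period 1: f(t) = |2t - 1| on [0, 1)."""
    return abs(2.0 * (t % 1.0) - 1.0)

def fourier_coeff(n: int) -> float:
    """Fourier coefficients of f: 1/2 for n = 0, 2/(pi n)^2 for odd n, else 0."""
    if n == 0:
        return 0.5
    return 2.0 / (math.pi * n) ** 2 if n % 2 else 0.0

def abel_sum(r: float, t: float, terms: int = 5000) -> float:
    """Sum over |n| <= terms of r^|n| fhat(n) e_n(t); f is even, so the
    terms n and -n combine into a cosine."""
    total = fourier_coeff(0)
    for n in range(1, terms + 1):
        total += 2.0 * r ** n * fourier_coeff(n) * math.cos(2.0 * math.pi * n * t)
    return total

# The sums approach f(t) as r < 1 tends to 1, also at the corner t = 0.
for r in (0.9, 0.99, 0.999):
    print(r, abel_sum(r, 0.0), f(0.0), abel_sum(r, 0.3), f(0.3))
```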
In this way we immediately recover Weierstrass’ theorem: every continuous
and periodic function f is the uniform limit of trigonometric polynomials
of the same period. The preceding theorem, with J = T, shows in fact that
$H_f(ru)$ converges uniformly on T to f(u) as r < 1 tends to 1. But the series
$H_f(ru) = \sum r^{|n|} \hat f(n)\, u^n$ is, for r < 1 given, normally convergent on T,
since $|\hat f(n)| \le \|f\|$. Its sum is therefore the uniform limit on T of its partial sums,
which are trigonometric polynomials; in other words, one may approximate f
uniformly by functions that one may approximate uniformly by trigonometric
polynomials. Qed.

We also recover the fact that, for a continuous function f of period 1 such 
that 


(23.6)  $\sum |\hat f(n)| < +\infty,$


one has 


(23.7)  $f(t) = \sum \hat f(n)\, e_n(t) = \sum \hat f(n)\, e^{2\pi i n t}$


for any t. The general term of the series $\sum r^{|n|} \hat f(n)\, u^n$ is actually majorised
on the closed interval 0 ≤ r ≤ 1 by $|\hat f(n)|$. Being a series of continuous
functions of r for u given, this series is therefore normally convergent on this
interval. Its sum is thus a continuous function of r on [0,1], so tends to its
value $\sum \hat f(n)\, u^n$ for r = 1 when r < 1 tends to 1; but it also tends to f(t) by
Theorem 19, qed.

We leave it to the reader to extend Theorem 19 to the general case of a
regulated function, i.e. to show that


(23.8)  $\lim H_f(zu) = \frac{1}{2}\,[f(u+) + f(u-)]$


for any u.
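A small numerical sketch of (23.8) along the radius, for the square wave equal to 1 on (0, 1/2) and to 0 on (1/2, 1), whose coefficients are $\hat f(0) = 1/2$ and $\hat f(n) = 1/(\pi i n)$ for odd n. The example function and the name `abel_sum_square` are assumptions for the illustration.

```python
import math

def abel_sum_square(r: float, t: float, terms: int = 4000) -> float:
    """Sum of r^|n| fhat(n) e_n(t) for the square wave; the terms n and -n
    combine into (2 / (pi n)) r^n sin(2 pi n t) for odd n > 0."""
    total = 0.5
    for n in range(1, terms + 1, 2):
        total += (2.0 / (math.pi * n)) * r ** n * math.sin(2.0 * math.pi * n * t)
    return total

# At the jump t = 0 every sine vanishes: the value is the mean of f(0+) = 1
# and f(0-) = 0. At the continuity point t = 1/4 the sum is
# 1/2 + (2/pi) arctan(r), which tends to f(1/4) = 1.
for r in (0.9, 0.99, 0.999):
    print(r, abel_sum_square(r, 0.0), abel_sum_square(r, 0.25))
```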
One may also deduce the Parseval-Bessel equality from the preceding theorem,
at the very least for the simple case where f is continuous. Since $H_f(ru)$
converges uniformly to f(u) it is clear that $|H_f(ru)|^2$ converges uniformly to
$|f(u)|^2$, whence, integrating,


(23.9)  $\oint |f(u)|^2\, dm(u) = \lim \oint |H_f(ru)|^2\, dm(u).$


But as the Fourier series $\sum \hat f(n)\, r^{|n|} u^n$ of $H_f(ru)$ is absolutely convergent for
r < 1, Chap. V, n° 5 shows, “without knowing anything”, that


(23.10)  $\oint |H_f(ru)|^2\, dm(u) = \sum r^{2|n|}\, |\hat f(n)|^2.$


When r < 1 tends to 1, the partial sums of the right hand side tend to
those of the series $\sum |\hat f(n)|^2$; now they are majorised by the left hand side of
(10), which tends to the right hand side of (9). We conclude that the partial
sums, and so the total sum, of the series $\sum |\hat f(n)|^2$ are majorised by the left
hand side of (9), whence the Parseval-Bessel inequality. But then the right
hand side of (10), considered as a series of continuous functions of r on [0, 1],
is dominated by the convergent series $\sum |\hat f(n)|^2$, so converges normally. We
may therefore pass to the limit term-by-term (Chap. III, n° 8, Theorem 9 or
n° 13, Theorem 17), whence the Parseval-Bessel equality using (9).
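The monotone passage to the limit in (23.10) can be watched on an example. The sketch below again assumes the triangle wave f(t) = |2t − 1|, for which the integral of $|f|^2$ over a period is 1/3; the names are illustrative.

```python
import math

def fourier_coeff(n: int) -> float:
    """Fourier coefficients of f(t) = |2t - 1|: 1/2, 2/(pi n)^2 (n odd), 0."""
    if n == 0:
        return 0.5
    return 2.0 / (math.pi * n) ** 2 if n % 2 else 0.0

def abel_energy(r: float, terms: int = 2000) -> float:
    """Right hand side of (23.10), truncated to |n| <= terms."""
    return sum(r ** (2 * abs(n)) * fourier_coeff(n) ** 2
               for n in range(-terms, terms + 1))

EXACT = 1.0 / 3.0  # integral of |2t - 1|^2 over [0, 1]

# The sums increase with r and approach the integral of |f|^2.
for r in (0.9, 0.99, 0.999, 1.0):
    print(r, abel_energy(r), EXACT)
```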


Exercise — For f regulated we have 
$\lim \oint |H_f(ru) - f(u)|^2\, dm(u) = 0$
(use Parseval-Bessel). 


24 — Harmonic functions 


The method of Fourier series applies to a class of functions closely linked to 
the holomorphic functions and which, historically, arose from mathematical 
physics (hydrodynamics, where d’Alembert had already written the Cauchy 
relations between the partial derivatives of a holomorphic function without 
having had the idea of going further, Newtonian potential, electrostatics, 
etc.) and transformed themselves in consequence, as always in such a case, 
into an occasion for the mathematicians to go very far beyond the needs of 
the users, and to generalise the situation. These functions are also linked to 
the $H_f$ that we have just studied.

Let f(z) = P(x,y) + iQ(x,y) be a holomorphic function on an open
set U. The Cauchy differential equation $f'_x = -i f'_y$ can then be written, on
separating the real and imaginary parts, in the form

(24.1)  $P'_x = Q'_y, \qquad P'_y = -Q'_x,$


which, since $f' = f'_x = P'_x + iQ'_x$, shows in passing that

(24.2)  $f' = P'_x - iP'_y = Q'_y + iQ'_x;$


in other words, the knowledge of P = Re(f) or of Q = Im(f) determines
f' and so determines f up to an additive constant. The function f is
analytic as a function of z and a fortiori $C^\infty$ as a function of the real variables
x and y; so likewise are P and Q, which allows us to differentiate the relations
(1). A trivial calculation then shows that


(24.3)  $\Delta P = P''_{xx} + P''_{yy} = 0, \qquad \Delta Q = 0,$


where 
(24.4)  $\Delta = D_1^2 + D_2^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2$


is the Laplace operator which generalises in the obvious way to functions of
any number of variables. A function⁴³ H(x, y) of class C² in an open set U
of C is said to be harmonic in U if it satisfies the relation ΔH = 0. One
may ask whether such a function, assuming it real valued, as we shall do in 
all the rest of this §, is the real part of a holomorphic function. Though not 
strictly correct, this conjecture is to a large measure true (but is of no help in 
studying harmonic functions of more than two variables, which require very 
different methods). 
If, inspired by (2), we associate with H the function 


(24.5)  $g = H'_x - iH'_y = D_1 H - iD_2 H,$


we see that the Laplace equation means that g is holomorphic. If we could
find a function f = P + iQ holomorphic in U and such that f' = g, we would
have $H'_x = P'_x$, $H'_y = P'_y$ and so H = P up to an additive constant. H would
then be the real part of a holomorphic function in U as hoped.


First assume, the simplest case, that H is harmonic on a disc D: |z| < R. 
The function g is then a power series, so has a primitive $f(z) = \sum a_n z^n$ in
D (Chap. II, n° 19), of which H is, up to an additive constant, the real part,
as we have just seen. Putting z = ru with |u| = 1, we then have


$2H(ru) = \sum_{n \ge 0} a_n r^n u^n + \sum_{n \ge 0} \bar a_n r^n \bar u^n = \sum_{n \ge 0} a_n r^n u^n + \sum_{n \ge 0} \bar a_n r^n u^{-n} = \sum_{\mathbf{Z}} c_n r^{|n|} u^n$


with $c_n = a_n$ if n > 0, $c_n = \bar a_{-n} = \bar c_{-n}$ if n < 0 and $c_0 = 2\,\mathrm{Re}(a_0)$. Since
$r^{|n|} u^n$ is equal to $z^n$ for n > 0 and to $\bar z^{|n|}$ for n < 0, we finally obtain the
following result:


Theorem 20. Every harmonic function H on a disc |z| < R has a series 
expansion of the form 


(24.6)  $H(z) = \sum_{\mathbf{Z}} c_n r^{|n|} u^n = c_0 + \sum_{n>0} \left[ c_n (x+iy)^n + \bar c_n (x-iy)^n \right]$


with

(24.7)  $c_n r^{|n|} = \oint H(ru)\, u^{-n}\, dm(u)$

for every r < R and every n ∈ Z.


43 The use of the letter U is traditional in physics. The mathematicians more often
use u, which, in our case, might provoke confusion with the variable of integration
on the unit circle T, while use of the letter U would provoke confusion with open
sets, which we generally denote U. The use of the letter H does not present these
risks, and, after all, is not absurd when treating harmonic functions ...


There is no greater problem of convergence for the series (6) than for the 
power series of f: they converge normally in every disc of radius r < R. The 
general term of the second series (6) is a homogeneous polynomial of degree
n in x, y, and clearly harmonic since it is twice the real part of $c_n z^n$.


Corollary (“Theorem of the Mean”). Let H be a harmonic function in
an open subset U of C. For every a ∈ U and every r > 0 such that U contains
the closed disc |z − a| ≤ r, one has


(24.8)  $H(a) = \oint H(a + ru)\, dm(u).$
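The mean value property is easy to test numerically. The sketch below assumes the sample harmonic function $H = \mathrm{Re}(e^z + z^3)$ and approximates the integral over T by a Riemann sum; the names `H` and `circle_mean` are illustrative.

```python
import cmath
import math

def H(z: complex) -> float:
    """Sample harmonic function: the real part of exp(z) + z^3."""
    return (cmath.exp(z) + z ** 3).real

def circle_mean(func, a: complex, r: float, n: int = 256) -> float:
    """Riemann sum for the mean of func over the circle |z - a| = r."""
    return sum(func(a + r * cmath.exp(2j * math.pi * k / n)) for k in range(n)) / n

a, r = 0.3 + 0.2j, 1.5
print(circle_mean(H, a, r), H(a))  # the two values agree

# For comparison, |z|^2 is not harmonic: its mean is |a|^2 + r^2, not |a|^2.
print(circle_mean(lambda z: abs(z) ** 2, a, r), abs(a) ** 2)
```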


The argument is less easy (and the result less correct ...) in the case
where H is given in an annulus C. Consider the Laurent series $\sum b_n z^n$ of the
function (5). After subtracting the term in 1/z it has, as we have seen at the
end of n° 16, a pseudo primitive


(24.9)  $f(z) = \sum a_n z^n$


such that


(24.10)  $g(z) = f'(z) + b_{-1}/z.$

First we shall show that the residue $b_{-1}$ is real⁴⁵.
Now, by (5),


$b_{-1} = \oint g(re(t))\, r e(t)\, dt = \oint r \left[ H'_x(re(t)) - iH'_y(re(t)) \right] (\cos 2\pi t + i \sin 2\pi t)\, dt,$

whence

$\mathrm{Im}(b_{-1}) = \oint \left[ H'_x(re(t))\, r \sin 2\pi t - H'_y(re(t))\, r \cos 2\pi t \right] dt.$


44 One can show that the harmonic functions in an open subset U of C are
characterised by the fact that their value at the centre of any disc D ⊂ U is equal
to their mean value over the boundary of D. It is not even necessary to assume
differentiability.

45 The function $g = H'_x - iH'_y$ is not an arbitrary holomorphic function; it must
be possible to put its real and imaginary parts P and Q in the form $P = H'_x$,
$Q = -H'_y$, which is not always the case.
Since re(t) has coordinates r cos(2πt) and r sin(2πt), the chain rule shows
that


$\frac{d}{dt} H(re(t)) = -2\pi \left[ H'_x(re(t))\, r \sin 2\pi t - H'_y(re(t))\, r \cos 2\pi t \right];$


consequently, $\mathrm{Im}(b_{-1})$ is, up to a factor −2π, the variation between 0 and 1
of the function t ↦ H(re(t)), zero by periodicity. The residue $b_{-1}$ is therefore
real.

Now let us put f = P + iQ with P and Q real. It follows that


$H'_x - iH'_y = g(z) = P'_x - iP'_y + b_{-1}/(x+iy)$

by (10), whence


$H'_x = P'_x + b_{-1}\, x/(x^2+y^2),$
$H'_y = P'_y + b_{-1}\, y/(x^2+y^2)$


since $b_{-1}$ is real. The function R = H − P thus satisfies the relations


$R'_x = b_{-1}\, x/(x^2+y^2),$
$R'_y = b_{-1}\, y/(x^2+y^2).$


Now the function


$L(x,y) = \log|z| = \log r = \frac{1}{2} \log(x^2+y^2),$


not to be confused with the pretend Log of the complex number z, has partial
derivatives

$L'_x = x/(x^2+y^2), \qquad L'_y = y/(x^2+y^2).$

The function $R - b_{-1}L$ thus has partial derivatives zero, so is constant, whence
it follows that

(24.11)  $H(x,y) = P(x,y) + b_{-1} \log r + \mathrm{const}.$


Since $f(z) = P(x,y) + iQ(x,y) = \sum a_n z^n$ we find


$H(z) = b \log r + c + \frac{1}{2} \sum (a_n z^n + \bar a_n \bar z^n) = b \log r + c + \frac{1}{2} \sum \left[ a_n (x+iy)^n + \bar a_n (x-iy)^n \right]$


with real constants b and c, summing over all nonzero n ∈ Z. The expansion


(24.12)  $H(ru) = b \log r + c + \frac{1}{2} \sum (a_n r^n + \bar a_{-n} r^{-n})\, u^n$

is deduced from this, and yields the general form of the Fourier coefficients
of the function H(re(t)). One may put all this in the form


(24.13)  $H(re(t)) = b \log r + c + \sum_{n \ge 1} \left[ b_n(r) \cos 2\pi n t + c_n(r) \sin 2\pi n t \right]$


where the coefficients $b_n(r)$ and $c_n(r)$ are linear combinations with real
coefficients of $r^n$ and $r^{-n}$.

Exercise. By using the equation ΔH = 0, show directly that the Fourier
series of u ↦ H(ru) has the form (13). (Argue as in n° 14).

The fact that logr and the negative powers of r disappear when H is 
harmonic on a disc is due to the continuity of H on a neighbourhood of 
the origin: the Fourier coefficients of H(re(t)) must remain bounded when r 
tends to 0. 

One of the consequences of these calculations is that, in an annulus, a 
harmonic function is not always the real part of a holomorphic function. 
This is the case only if b = 0 in the expansion (13); direct calculation of the
Fourier coefficients of H(ru) shows that


(24.14)  $b \log r + c = \oint H(ru)\, dm(u) = \oint H(re^{2\pi i\theta})\, d\theta,$


the mean value of H over the circle |z| = r. One may explain the appearance
of the function log r by noting that it is the real part of the “function” Log z =
log r + i arg z, which would be holomorphic for z ≠ 0 if one could forget the
ambiguity inherent in the definition of the argument; this ambiguity being
pure imaginary, the real part log r is, itself, a function in the strict sense,
and it is harmonic. You can check this by calculating its Laplacian directly.
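For readers who prefer a numerical check to the direct calculation, here is a minimal sketch using the standard five-point finite-difference approximation of the Laplacian. The step size and the helper names are assumptions for the illustration.

```python
import math

def laplacian(func, x: float, y: float, h: float = 1e-3) -> float:
    """Five-point finite-difference approximation of the Laplacian at (x, y)."""
    return (func(x + h, y) + func(x - h, y) + func(x, y + h) + func(x, y - h)
            - 4.0 * func(x, y)) / h ** 2

def L(x: float, y: float) -> float:
    """L(x, y) = log r = (1/2) log(x^2 + y^2), defined away from the origin."""
    return 0.5 * math.log(x * x + y * y)

print(laplacian(L, 0.7, -0.4))                           # close to 0
print(laplacian(lambda x, y: x * x + y * y, 0.7, -0.4))  # close to 4: r^2 is not harmonic
```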


25 — Limits of harmonic functions 


We have seen, in (24.8), that if a function H is harmonic on an open disc of 
radius R, its value at the centre of the latter is equal to its mean value over 
every concentric circle of radius r < R. We deduce that the maximum theorem 
(Theorem 11 of n° 15) and its corollary are valid for harmonic functions; the 
proofs are precisely the same. In particular, if a function is continuous on the 
closure K of a bounded domain G and is harmonic in G, and is zero on the 
boundary F of G, then it is identically zero since $\|H\|_G = \|H\|_F$.

If a function H is harmonic on a disc |z| < R of radius R > 1 and if one
puts f(u) = H(u) on T, the series expansion


$H(z) = \sum c_n r^{|n|} u^n$  with  $c_n r^{|n|} = \oint H(ru)\, u^{-n}\, dm(u)$


of Theorem 20, valid for r < R, holds in particular for r = 1 and shows that
$c_n = \hat f(n)$. Hence

$H(z) = H_f(z)$  for |z| < 1.


In the case of an arbitrary radius R one can, for r < R, apply this result to
the function $z \mapsto H(rz)$, harmonic in the disc of radius R/r > 1. Whence


$H(rz) = \oint \frac{1-|z|^2}{|z-u|^2}\, H(ru)\, dm(u)$  (|z| < 1),


or, on replacing z by z/r,


(25.1)  $H(z) = \oint \frac{r^2-|z|^2}{|z-ru|^2}\, H(ru)\, dm(u)$  for |z| < r.

This is the analogue for harmonic functions of Cauchy’s integral formula of 
n° 1; the existence of such a formula is scarcely surprising, since, on a disc, 
a harmonic function is the real part of a holomorphic function. 
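Formula (25.1) can be tested on a concrete harmonic function. The sketch below assumes $H = \mathrm{Re}(e^z)$, the radius r = 2 and a Riemann sum over equally spaced points of T; the name `poisson_integral` is illustrative.

```python
import cmath
import math

def H(z: complex) -> float:
    """Sample harmonic function: the real part of exp(z)."""
    return cmath.exp(z).real

def poisson_integral(func, z: complex, r: float, n: int = 2048) -> float:
    """Riemann sum for the right hand side of (25.1), valid for |z| < r."""
    total = 0.0
    for k in range(n):
        u = cmath.exp(2j * math.pi * k / n)
        kernel = (r * r - abs(z) ** 2) / abs(z - r * u) ** 2
        total += kernel * func(r * u)
    return total / n

z = 0.5 + 0.3j
print(poisson_integral(H, z, 2.0), H(z))  # values on |z| = 2 recover H inside
```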

Weierstrass’ theorem on uniformly convergent sequences of holomorphic
functions applies also to harmonic functions, but needs several preliminaries.
First, the formula (24.6), i.e.


(25.2)  $H(x,y) = c_0 + \sum_{n>0} \left[ c_n (x+iy)^n + \bar c_n (x-iy)^n \right],$
n>0 


shows that a harmonic function is of class $C^\infty$; again not very surprising,
since, locally, it is the real part of an analytic function. Further, if one
differentiates the general term of the series (2) with respect to x or y one multiplies
the coefficients of order n by n or ±in; up to a factor of modulus 1 this is
equivalent to replacing the two power series in x + iy = z and x − iy = $\bar z$ that
appear in (24.6) by their derived series; the resulting series, and more
generally, those obtained by differentiating term-by-term ad libitum with respect
to x and y, converge under exactly the same conditions as (2). Theorem 20
of Chap. III, n° 17 would then show, if needed, that H is indefinitely
differentiable and that its partial derivatives of arbitrary order are obtained
by differentiating the series (2) term-by-term with respect to x and y. (The
fact that we are dealing with functions of two variables is not important: the
variable with respect to which one is not differentiating plays the role of a
constant).
We deduce, after a small calculation, that the partial derivative


$D_1^p D_2^q H = H^{(p,q)}$

is given by


$H^{(p,q)}(x,y) = \sum n(n-1)\cdots(n-p-q+1) \left[ i^q c_n (x+iy)^{n-p-q} + (-i)^q \bar c_n (x-iy)^{n-p-q} \right]$


where one sums over n ≥ p + q. In particular,

$H^{(p,q)}(0,0) = (p+q)!\, \left[ i^q c_{p+q} + (-i)^q \bar c_{p+q} \right] = 2 (p+q)!\, \mathrm{Re}(i^q c_{p+q}).$

Since $c_n r^n = \oint H(ru)\, u^{-n}\, dm(u)$ for n > 0 we have

(25.3)  $H^{(p,q)}(0,0) = (p+q)!\, r^{-p-q} \oint H(ru) \left[ i^q u^{-p-q} + (-i)^q u^{p+q} \right] dm(u)$


and consequently


(25.4)  $|H^{(p,q)}(0,0)| \le 2\, \frac{(p+q)!}{r^{p+q}} \sup_{|u|=1} |H(ru)|.$

From this we have the analogue of Weierstrass’ convergence theorem: 


Theorem 21. Let G be an open set in C and $(H_n)$ a sequence of harmonic
functions in G which converges uniformly on every compact K ⊂ G to a
limit function H. Then H is harmonic, and, for any p and q, the partial
derivatives $H_n^{(p,q)}$ converge uniformly on every compact subset of G to the
partial derivative $H^{(p,q)}$ of H.


We know thanks to Borel-Lebesgue (Chap. V, n° 6) that uniform convergence
on every compact subset is a property of local character: to verify it
for every compact K ⊂ G it is enough to show that, for every a ∈ G, it holds
on a closed disc of centre a.

So choose an R > 0 such that the disc D: |z − a| ≤ R is contained in G
and put r = R/2. For every z such that |z − a| ≤ r the closed disc of centre z
and of radius r is contained in D. By (4) we have, for every harmonic function
U in G,

$|U^{(p,q)}(z)| \le 2 (p+q)!\, r^{-p-q} \sup_{|u|=1} |U(z+ru)|;$

but for |z − a| ≤ r, all the points z + ru are in the large disc D, whence
trivially

$\sup_{|u|=1} |U(z+ru)| \le \|U\|_D,$


and consequently

$|U^{(p,q)}(z)| \le 2 (p+q)!\, r^{-p-q}\, \|U\|_D$  for |z − a| ≤ r.

Now we apply the general result to the functions $U = H_m - H_n$. Since the
$H_n$ converge uniformly on every compact subset of G, and in particular on
D, we have $\|H_m - H_n\|_D \le \varepsilon$ for m and n large. The preceding inequality
then shows that, for m and n large, we have


$|H_m^{(p,q)}(z) - H_n^{(p,q)}(z)| \le 2 (p+q)!\, r^{-p-q}\, \varepsilon$


at all the points of the disc |z − a| ≤ r.
By Cauchy’s criterion, the partial derivatives $H_n^{(p,q)}$ converge uniformly
in this disc and more generally, since a ∈ G is arbitrary, on every compact
K ⊂ G. It follows that the function H is $C^\infty$ like the $H_n$ and that the $H_n^{(p,q)}$
converge to $H^{(p,q)}$ for any p and q. This allows us to pass to the limit in the
Laplace equation $\Delta H_n = 0$, so that H is again harmonic, qed.

If the domain G is bounded and if the $H_n$ are continuous on the closure
K of G, the maximum theorem shows that, if the $H_n$ converge uniformly on
the boundary F = K − G of G, then they converge uniformly in G:


$\|H_m - H_n\|_G = \|H_m - H_n\|_F.$


The preceding theorem applies to this case (but do not believe that the partial 
derivatives converge uniformly on all of G: they converge uniformly only on 
every compact subset of G). 


26 — The Dirichlet problem for a disc 


As we saw in the preceding n°, if a function H is defined and harmonic on a disc of
radius R > 1 and if one puts f(u) = H(u) for u ∈ T, then $H(z) = H_f(z)$ for
|z| < 1. In this case Theorem 19 loses its interest: the series (24.6) converges
normally in |z| ≤ r for every r < R, so for some r > 1, so that the passage to the
limit when r < 1 tends to 1 results from the continuity of H in the closed
disc |z| ≤ 1, and even beyond.

The situation becomes more interesting if, given an arbitrary real
regulated function f on T, one associates with it the function


(26.1)  $H_f(z) = \sum \hat f(n)\, r^{|n|} e_n(t) = \sum \hat f(n)\, r^{|n|} u^n = \oint P(zv^{-1})\, f(v)\, dm(v),$


defined a priori for |z| < 1. Since f is real we have $\hat f(-n) = \overline{\hat f(n)}$ and
the function (1) is the real part of the power series
$\hat f(0) + 2 \sum_{n>0} \hat f(n) z^n$, and so is harmonic; see also (22.15).

If f is continuous, we know (Theorem 19) that $H_f(z)$ tends to f(u) when
z converges (not necessarily along a radius) to a u ∈ T while remaining in
the disc |z| < 1. This means that the function equal to $H_f(z)$ for |z| < 1,
and to f on T, is continuous in the closed disc |z| ≤ 1. This was proved
using the fact that the functions $u \mapsto P(zu^{-1})$ have the properties of a Dirac
sequence when |z| < 1 tends to 1. This result furnished us a second proof of
Weierstrass’ approximation theorem.

Granted this, one could give a simpler proof of Theorem 19. Since the
function $u \mapsto P(zu^{-1})$ is positive and has integral 1 on T, the formula (1)
shows that $|H_f(z)| \le \|f\|$, whence the relation

(26.2)  $\|H_f\|_D \le \|f\|$


between the uniform norms of f in T and of $H_f$ on the open disc D: |z| < 1;
this is just the maximum principle for the harmonic function $H_f$, modulo
the fact that we do not yet know (or we are pretending not to know yet)
that $H_f$ is the restriction to the open disc of a continuous function on the
closed disc. Now f is the uniform limit on T of a sequence of trigonometric
polynomials $f_n$, which one may assume real if f is. For every trigonometric
polynomial g the series $H_g(z) = \sum \hat g(n)\, r^{|n|} u^n$ reduces to a finite sum, so is a
continuous function of z = ru on all C. Denote by $H_n$ the harmonic function
corresponding to $g = f_n$; by (2), we have $\|H_p - H_q\|_D \le \|f_p - f_q\|$ for any p
and q; but since $H_p - H_q$ is defined and continuous on the closed disc |z| ≤ 1
(and in fact on C), we have


$\|H_p - H_q\|_{\bar D} = \|f_p - f_q\|$


where $\bar D$ is the closed disc |z| ≤ 1. Consequently (Cauchy’s criterion), the
$H_n$ converge uniformly on $\bar D$ and their limit is continuous there. Now they
converge to $H_f$ in the open disc |z| < 1 since $\|H_f - H_n\|_D \le \|f - f_n\|$ by
(2), and to f on T. Whence the result:


Theorem 22. Let f be a continuous function on T. Then the function equal
to


(26.3)  $H_f(z) = \oint \frac{1-|z|^2}{|z-v|^2}\, f(v)\, dm(v)$


for |z| < 1 and to f on T is continuous on the closed disc |z| ≤ 1 and
harmonic in the open disc |z| < 1. This is the only function possessing these
properties.


Uniqueness follows from the maximum theorem; see the beginning of the 
preceding n°. 

We have resolved a very particular case of the Dirichlet problem which 
can be stated roughly as follows: given a bounded domain G in C whose 
boundary is a not too savage curve, and, on the latter, a continuous function 
f, to construct a continuous function on the closure $\bar G$ of G, harmonic in G
and equal to f on the boundary of G. Generalised to Euclidean spaces of
arbitrary dimension, and to other differential operators than Δ, this is one
of the fundamental problems of the theory of partial differential equations. 
Let us make clear that, even in the classical case of the Laplacian in an open 
subset of C, the case of the disc does not reflect the level of difficulty of the 
problem. 

We remark moreover that a harmonic function in the open disc |z| < 1
in general has no reason to be continuable to a continuous function on the
closed disc |z| ≤ 1. The simplest counterexample is provided by the function
$P(z) = (1-|z|^2)/|z-1|^2$ itself; it is harmonic in C − {1} but does not tend
to any limit when z tends to 1. A much more complicated case is obtained
by starting from an arbitrary measure or even a distribution μ on T and
considering the function


(26.4)  $H_\mu(z) = \oint \frac{1-|z|^2}{|z-v|^2}\, d\mu(v) = \sum \hat\mu(n)\, r^{|n|} u^n;$


its behaviour on a neighbourhood of the unit circle can be as strange as
that of a holomorphic function. Again we do not obtain the most general
harmonic functions $\sum c_n r^{|n|} u^n$ in this way, for one cannot have $c_n = \hat\mu(n)$
for a distribution μ unless the coefficients $c_n$ are of slow increase (n° 10,
Theorem 6), which need not happen, even if the series converges for r < 1.
Counterexample and exercise: $c_n = \exp(|n|^{1/2})$ for every n.
§ 6. From Fourier series to integrals


In this §, the ∫ sign denotes an integral extended over R, while the sign ∮
denotes an integral extended over an interval of length 1. Recall the notation


$e(x) = e^{2\pi i x}, \qquad e_y(x) = e(xy)$


for y real.


27 — The Poisson summation formula 


Recall also that given a regulated and absolutely integrable function f on R
one defines the Fourier transform of f by the formula


(27.1)  $\hat f(y) = \int f(x)\, e^{-2\pi i xy}\, dx = \int \overline{e(xy)}\, f(x)\, dx.$

The integral converges since the exponential has modulus 1.


Theorem 23. The Fourier transform of an absolutely integrable function is 
continuous and tends to 0 at infinity. 


Assume that y remains in a compact subset H of R; the function e(xy)
is continuous on R × H and there exists a function p(x) [namely 1] such that
|e(xy)| ≤ p(x) for every y ∈ H and ∫ p(x)|f(x)|dx < +∞. It then remains
to apply Theorem 22 of Chap. V, n° 23, substituting e(xy) for f(x,y) and f
for μ. One could clearly argue directly: integrating over [−N, N] instead of
R, one commits, whatever y might be, an error < r if N is large enough; so
it suffices (uniform limits of continuous functions) to prove the continuity
of the integral over K = [−N, N]. But since (x,y) ↦ e(xy) is uniformly
continuous on every compact subset of R², the function x ↦ e(xy) converges
to e(xb) uniformly on K when y tends to a limit b; one may therefore pass
to the limit in the integral over K.

It is clear that $\hat f$ is bounded, with


(27.2)  $\|\hat f\| = \sup |\hat f(y)| \le \int |f(x)|\, dx = \|f\|_1.$


To show that $\hat f$ tends to 0 at infinity one proceeds from the simplest to the
most general case.

(i) If f is the characteristic function of a compact interval [a, b],


$\hat f(y) = \int_a^b e^{-2\pi i xy}\, dx = \left[ \frac{e^{-2\pi i xy}}{-2\pi i y} \right]_a^b$


for y ≠ 0, whence the result in this case, hence also if f is a step function
vanishing outside a compact interval.

(ii) If f is zero outside a compact interval K and integrable on K, then
for every r > 0 there is a step function g zero outside K such that
∫ |f(x) − g(x)|dx < r as shown by the very definition of an integral (Chap. V, n° 2).
Then $|\hat f(y) - \hat g(y)| < r$ for every y by (2); since $|\hat g(y)| < r$ for |y| large, we
have $|\hat f(y)| < 2r$ for |y| large, whence again the result.

(iii) In the general case, for every r > 0 there exists a compact interval
K such that the contribution of R − K to the total integral of |f(x)| is < r;
integrating over K in (1), one commits an error < r for every y, and since
the integral over K tends to 0, we again find $|\hat f(y)| < 2r$ for |y| large, qed.

As we have already seen à propos the function cot or the elliptic
functions, the “Eisenstein method”, as Weil and Remmert call it, for constructing
periodic functions on R consists of starting from non periodic functions f(x)
and considering the series


(27.3)  $F(x) = \sum f(x+n),$


summing over Z. If the series converges unconditionally, i.e. absolutely, the
result is incontestably periodic since changing x to x + 1 is equivalent to the
permutation n ↦ n + 1 in Z. One may then try to expand the result as a
Fourier series.

If one calculates formally, taking account of the periodicity of the
exponentials,


(27.4)  $\hat F(p) = \oint \overline{e_p(x)}\, dx \sum f(x+n) = \oint dx \sum f(x+n)\, \overline{e_p(x+n)}$
        $= \sum \oint f(x+n)\, \overline{e_p(x+n)}\, dx = \int f(x)\, \overline{e_p(x)}\, dx = \hat f(p),$


where $\hat f$ is the Fourier transform of f. And since “every” periodic function is
the sum of its Fourier series, we finally find the Poisson summation formula
(though he never wrote it in this form)


(27.5)  $\sum f(x+n) = \sum \hat f(n)\, e_n(x),$


in particular, for x = 0,


(27.6)  $\sum f(n) = \sum \hat f(n).$


All this is formal calculation. The first problem is to justify the
permutation of the signs ∫ and Σ performed to obtain (4). It is simplest to assume
first that f is continuous and that the series Σ f(x+n) converges normally
on [0,1], in which case it is clear that it converges normally on every
compact set, by periodicity; the presence of the factors $e_p(x)$ does not change
anything, since they are of modulus 1. If these conditions are satisfied then F
is continuous and the term-by-term integration in (4) is justified (Chap. V,
n° 4, Theorem 4). Subject to these hypotheses, the function f is moreover
absolutely integrable on R, for the integral of |f(x)| over (−n, n), the nth
partial sum of the series Σ ∮ |f(x+p)|dx, where one integrates over (0, 1), is, for
every n, less than the total sum of this series; the convergence of ∫ |f(x)|dx
follows from this (Chap. V, n° 22, Theorem 18). The formal calculation is
therefore justified. It remains to justify the relation (5), which says that F
is everywhere equal to the sum of its Fourier series; to do this it is enough
to assume that the latter is absolutely convergent, i.e. that $\sum |\hat f(p)| < +\infty$;
convergence for any x would suffice, by Fejér, but it is better, in this context,
just to use a simple result:


Theorem 24. Let f be a function defined and continuous on R such that


(i) the series Σ f(x+n) converges normally on every compact set,
(ii) $\sum |\hat f(n)| < +\infty$.


Then f is absolutely integrable on R and


(27.7)  $\sum f(x+n) = \sum \hat f(n)\, e_n(x)$  for every x ∈ R.


In practice, the convergence of the series Σ f(x+n) is almost always
obtained by estimating f(x) for |x| large. Assume for example


(27.8)  $f(x) = O(|x|^{-s})$  at infinity, with s > 1.


The continuous function $|x|^s f(x)$, being bounded for |x| large, i.e. outside a
compact set, is in fact bounded on R, being bounded on every compact set;
so likewise is f, so also $(1+|x|^s) f(x)$, from which we have the estimate


$|f(x)| \le M/(1+|x|^s)$  for every x,


with a constant M > 0. This shows that f is absolutely integrable on R
(Chap. V, n° 22). If x remains in [0,1], then |x + n| varies between |n| and
|n+1|, so is ≥ |n| or |n+1| according to the sign of n. The series $\sum 1/(1+|n|^s)$
being convergent since s > 1, normal convergence of Σ f(x+n) follows. As
to the convergence of $\sum |\hat f(n)|$, this is assured, as we shall see later, if f is
sufficiently differentiable, as in the case of periodic functions.

The real problem, in the practical use of the Poisson formula, or more 
generally of the Fourier transform, is that we have to calculate the Fourier 
transforms explicitly. Sometimes this is easy, as we shall see, but the crude 
method — calculating a primitive of the integrand — is not any use in general, 
because the primitive does not reduce to “elementary” functions. We have 
therefore to find methods for calculating the integral over R (and not over an 
arbitrary interval) without knowing the primitive; it was the greatest success 
of Cauchy’s residue calculus that it allowed this kind of calculation in cases 
unknown up till then. Nothing better has been found since then; we have
many formulae for Fourier transforms in terms of special functions, Euler’s
Γ function for example, but they are almost always obtained by Cauchy’s
method.


Example 1. Choose the function


$f(t) = 1/(z+t)^s$


where z is a non real complex parameter and s an integer ≥ 2, with for
example Im(z) > 0. The preceding considerations show that the series Σ f(t+n)
is normally convergent on every compact set, but it remains to calculate the
Fourier transform


$\hat f(u) = \int \exp(-2\pi i u t)\, (z+t)^{-s}\, dt$


for u real, not necessarily an integer. To seek a primitive, for example by
integrating by parts, would lead, with more complication, to integrals of the type
$\int e^x x^n\, dx$ of Chap. V, n° 15, Example 2; they can be calculated immediately
by hand for n an integer ≥ 0 but, for n < 0, and especially for n = −2,
they resist every attempt at explicit calculation (and not only because we
are working here at too elementary a level); Euler’s gamma function would
not have survived if one had been able to calculate a primitive of $e^{-x} x^s$.
But, with his residue method, Cauchy succeeded in calculating in a general
way the Fourier transform of a rational function p/q having no real pole and
decreasing sufficiently fast at infinity, i.e. such that d(q) > d(p) + 1. We may
prove for example that, for s integer ≥ 2 (convergence!), we have


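(27.9)  $\int \exp(-2\pi i u t)\, (z+t)^{-s}\, dt = \begin{cases} (-2\pi i)^s u^{s-1} \exp(2\pi i u z)/(s-1)! & \text{if } u > 0, \\ 0 & \text{if } u \le 0 \end{cases}$

on condition that Im(z) > 0. The summation formula $\sum f(n) = \sum \hat f(n)$ can
be written, in this case, as


(27.10)  $\sum_{\mathbf{Z}} \frac{1}{(z+n)^s} = \frac{(-2\pi i)^s}{(s-1)!} \sum_{n>0} n^{s-1} e^{2\pi i n z}$  for Im(z) > 0.


For s = 2, this is


(27.11)  $\sum_{\mathbf{Z}} 1/(z-n)^2 = -4\pi^2 \sum_{n>0} n\, e^{2\pi i n z};$
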

now we know (see (8.14) for example) that, for z not an integer, the left hand
side is equal to $\pi^2/\sin^2 \pi z$; for Im(z) > 0 we have $|e^{iz}| < 1$ and so


$1/\sin^2 z = -4/(e^{iz} - e^{-iz})^2 = -4e^{2iz}/(1 - e^{2iz})^2 = -4e^{2iz} (1 + 2e^{2iz} + 3e^{4iz} + \ldots)$


using the power series of $1/(1-x)^2$. In this way we see (11) as a consequence
of the expansion of $1/\sin^2 z$ as a series of rational fractions, and vice-versa.
Starting from (11) one might obtain the general case (10) by differentiating
with respect to z: the right hand side of (11) is a series of holomorphic
functions, so that, to legitimate the differentiations, it suffices, thanks to
Weierstrass, to show that the right hand side of (11) converges normally on
every compact subset of the half plane Im(z) > 0; but on such a compact
we have $|e^{2\pi i n z}| = e^{-2\pi n y}$ where y = Im(z) remains larger than a strictly
positive number m, for the distance from a compact set to the boundary of
an open set containing it is always > 0; since $e^{-2\pi m} < 1$ normal convergence
is then clear.
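The identity (27.11) lends itself to a quick numerical check at a point of the upper half plane. In the sketch below the test point z = 0.3 + 0.5i, the truncation orders and the function names are assumptions for the illustration; the sum over Z converges slowly (its tail is of order 2/N), while the exponential series converges geometrically.

```python
import cmath
import math

def closed_form(z: complex) -> complex:
    """pi^2 / sin^2(pi z), the value of the left hand side of (27.11)."""
    return (math.pi / cmath.sin(math.pi * z)) ** 2

def exponential_series(z: complex, terms: int = 60) -> complex:
    """Right hand side of (27.11): -4 pi^2 sum_{n>0} n e^{2 pi i n z}."""
    q = cmath.exp(2j * math.pi * z)
    return -4.0 * math.pi ** 2 * sum(n * q ** n for n in range(1, terms + 1))

def partial_fractions(z: complex, N: int = 50000) -> complex:
    """Symmetric partial sum of 1/(z - n)^2 over |n| <= N."""
    return sum(1.0 / (z - n) ** 2 for n in range(-N, N + 1))

z = 0.3 + 0.5j
print(closed_form(z), exponential_series(z), partial_fractions(z))
```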

In fact, the residue calculus (Chap. VIII, n° 10, (ii)) allows one to extend
the formula (9) and so (10), replacing (s − 1)! by Γ(s), to the case of a
complex exponent s satisfying only the condition Re(s) > 1, so as to make
the series (10) converge.


Example 2. Now choose 


(27.12) $$f(x) = e^{-t|x|}$$


where t is a parameter > 0, so that f is integrable on R. Then

$$\hat f(y) = \int \exp(-t|x| - 2\pi i x y)\,dx;$$

on each of the intervals x < 0 and x > 0 one has to integrate a function of
the form $e^{cx}$, with c complex, and since such a function has primitive $e^{cx}/c$
the calculation is immediate and yields the result:


(27.13) $$\hat f(y) = 2t/(t^2 + 4\pi^2 y^2).$$

Simple estimates show that Theorem 24 applies here, whence

$$\sum e^{-t|n|} = 2t \sum 1/(t^2 + 4\pi^2 n^2),$$

a formula strongly resembling the expansion of coth t as a series of rational
fractions ...
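The identity just obtained can be checked numerically. The sketch below (hypothetical helper names, truncation orders chosen here) compares both sides for one value of t; the left hand side is a geometric series whose sum is coth(t/2), which is printed as a third column.

```python
import math


def exponential_sum(t: float, terms: int = 2000) -> float:
    """Sum over all integers n of exp(-t|n|), truncated at |n| <= terms."""
    return 1.0 + 2.0 * sum(math.exp(-t * n) for n in range(1, terms + 1))


def rational_sum(t: float, terms: int = 200_000) -> float:
    """2t * sum over all integers n of 1/(t^2 + 4 pi^2 n^2), truncated."""
    four_pi_sq = 4.0 * math.pi ** 2
    tail = sum(1.0 / (t * t + four_pi_sq * n * n) for n in range(1, terms + 1))
    return 2.0 * t * (1.0 / (t * t) + 2.0 * tail)


t = 1.0
print(exponential_sum(t), rational_sum(t), 1.0 / math.tanh(t / 2))
```

The rational series converges only like 1/N, which is why it is truncated much later than the exponential one.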


Exercise. Extend these calculations to the case where t is complex, with 
Re(t) > 0 (use Theorem 24 bis of Chap. V, n° 25). 
28 — Jacobi’s theta function 


Another more spectacular application of Theorem 24 depends on the calculation
of the Fourier transform of the function $f(x) = \exp(-\pi x^2)$. We have
already met this in Chap. V, n° 25, Example 2, and we showed there, by
differentiating under the $\int$ sign, that


$$\hat f(y) = c f(y) \quad\text{where}\quad c = \hat f(0) = \int \exp(-\pi x^2)\,dx.$$


If one shows that Theorem 24 and in particular (27.6) applies to f, then one
has $c \sum f(n) = \sum \hat f(n) = \sum f(n)$ and so c = 1 since the f(n) are all > 0.

Now the function $\exp(-\pi x^2)$ decreases at infinity faster than every negative
power of x, so satisfies the condition (27.8). Since, for the same reason,
the series $\sum \hat f(n)$ converges absolutely, Theorem 24 applies. Moreover, it
yields the identity


(28.1) $$\sum \exp[-\pi(x+n)^2] = \sum \exp(-\pi n^2)\,e_n(x),$$


valid for every x ∈ R.
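A quick numerical illustration of (28.1), with $e_n(x) = e^{2\pi i n x}$: since the coefficients $\exp(-\pi n^2)$ are even in n, the right hand side is a cosine series. The function names and the truncation order below are assumptions made for this sketch.

```python
import math


def shifted_gaussian_sum(x: float, terms: int = 30) -> float:
    """Left side of (28.1): sum over n of exp(-pi (x + n)^2)."""
    return sum(math.exp(-math.pi * (x + n) ** 2) for n in range(-terms, terms + 1))


def fourier_side(x: float, terms: int = 30) -> float:
    """Right side of (28.1): sum over n of exp(-pi n^2) * e_n(x), real by symmetry."""
    return sum(
        math.exp(-math.pi * n * n) * math.cos(2 * math.pi * n * x)
        for n in range(-terms, terms + 1)
    )


for x in (0.0, 0.25, 0.5, 0.9):
    print(x, shifted_gaussian_sum(x), fourier_side(x))
```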
One may generalise, replacing the function $f(t) = \exp(-\pi t^2)$ by


(28.2) $$f(t,z) = e^{\pi i z t^2}$$

where z = x + iy is a complex parameter. Then

$$|f(t,z)| = \exp(-\pi y t^2) = q^{t^2} \quad\text{where}\quad q = e^{-\pi y};$$


this expression is $\geq 1$ if $y \leq 0$, whence $\sum |f(n,z)|$ has no chance of converging;
if on the other hand y > 0, then q < 1 so that, for z given, |f(t, z)|
tends to 0 at infinity more rapidly than $t^{-N}$ for any N > 0 (Chap. IV, n° 5),
whence normal convergence on every compact set of $\sum f(t+n, z)$. It remains
to calculate the Fourier transform


(28.3) $$\hat f(u,z) = \int \exp(\pi i z t^2 - 2\pi i u t)\,dt = \int g(t,z)\,dt.$$


First assume z = iy pure imaginary, so that $izt^2 = -yt^2$. The change of
variable $t \mapsto y^{-1/2}t$ gives


$$\hat f(u, iy) = \int \exp(-\pi t^2 - 2\pi i u y^{-1/2} t)\,y^{-1/2}\,dt,$$

which leads us to the Fourier transform of $\exp(-\pi t^2)$ for the value $u y^{-1/2}$:
thus


(28.4) $$\hat f(u,iy) = y^{-1/2} \exp(-\pi u^2/y) \quad\text{for } y > 0.$$


In the general case, since the function g(t, z) under the $\int$ sign in (3) is,
for given t, holomorphic in the half plane U : Im(z) > 0, $\hat f(u,z)$ is probably
holomorphic too ($\hat f$ = f with a hat over it, see (28.3)). To confirm this, we
very luckily find in Chap. V, n° 25, a Theorem 24 bis which presupposes
the following hypotheses: (i) the integral (3) converges absolutely: obvious;
(ii) the complex derivative $g'(t,z) = \pi i t^2 g(t,z)$ with respect to z is a continuous
function of (t, z): obvious; (iii) for every compact $H \subset U$ there exists
an integrable function $p_H(t)$ on R such that $|g'(t,z)| \leq p_H(t)$ for any t ∈ R
and z ∈ H: this demands a proof. But since the compact set H is contained
in the open half plane Im(z) > 0 there exists (see above) a number m > 0
such that Im(z) ≥ m for every z ∈ H; then


$$|g'(t,z)| = \pi t^2 |g(t,z)| = \pi t^2 \exp(-\pi y t^2) \leq \pi t^2 \exp(-\pi m t^2) = p_H(t),$$


an integrable function on R because at infinity the function $\exp(-\pi m t^2)$ is
$O(t^{-2N})$ for any N > 0; we would be happy with much less.

The function (3) is therefore holomorphic in the half plane U : Im(z) > 0.
Since we know how to calculate it for z = iy pure imaginary we will obtain
the general case by constructing on Im(z) > 0 the one and only (principle
of analytic continuation) holomorphic function which, on the imaginary axis,
reduces to (4). Now


(28.5) $$\hat f(u,z) = (z/i)^{-1/2} \exp(-\pi i u^2/z)$$


for z pure imaginary, agreeing that $(z/i)^{-1/2}$ is positive real for z = iy.
The factor $\exp(-\pi i u^2/z)$ being holomorphic in C*, we have only to find
a holomorphic function in the half plane U which, for z = iy, reduces to
$y^{-1/2}$; but, up to a few details, this is what we did at the end of n° 16. For
z ∈ U, the ratio $z/i = \zeta$ indeed lies in the half plane $\mathrm{Re}(\zeta) > 0$ contained in
$\mathbb{C} - \mathbb{R}_-$; in the latter one may define a uniform, i.e. holomorphic, branch of
the “multiform function” $\zeta^{-1/2}$ by putting


$$\zeta^{-1/2} = |\zeta|^{-1/2} e^{-i\arg(\zeta)/2} \quad\text{with } |\arg(\zeta)| < \pi.$$


Since the point z = i corresponds to $\zeta = 1$ where $\arg(\zeta) = 0$, the holomorphic
function we seek is therefore given by the formula


(28.6) $$(z/i)^{-1/2} = |z|^{-1/2} e^{-\frac{i}{2}\arg(z/i)} \quad\text{with } |\arg(z/i)| < \pi$$


in the half plane Im(z) > 0 in question (and even in C with the negative
imaginary half-axis removed). This is equivalent to choosing


$$\arg(z/i) = \arg(z) - \pi/2 \quad\text{with } 0 < \arg(z) < \pi,$$


a very natural choice: for $\arg(i) = \pi/2 + 2k\pi$ and the translation $-\pi/2$ moves
the interval ]0, π[ to the interval ]−π/2, π/2[.
This point clarified, the Poisson summation formula gives us


(28.7) $$\sum \exp[\pi i z(t+n)^2] = (z/i)^{-1/2} \sum \exp(-\pi i n^2/z + 2\pi i n t).$$


Introducing the Jacobi function

(28.8) $$\theta(z) = \sum \exp(\pi i n^2 z), \quad \mathrm{Im}(z) > 0$$

(or, for z pure imaginary, the Poisson function), (7) reduces, for t = 0, to

(28.9) $$\theta(-1/z) = (z/i)^{1/2}\,\theta(z);$$

note, a detail to remember, that

$$\mathrm{Im}(z) > 0 \Longrightarrow \mathrm{Im}(-1/z) > 0.$$
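The functional equation $\theta(-1/z) = (z/i)^{1/2}\theta(z)$ lends itself to a numerical check. In the sketch below the names `theta` and `functional_equation_gap` and the truncation order are assumptions made here; the branch of the square root is the principal one, which agrees with the choice $|\arg(z/i)| < \pi$ made in the text because Re(z/i) > 0 in the upper half plane.

```python
import cmath
import math


def theta(z: complex, terms: int = 40) -> complex:
    """Truncated Jacobi series: sum over |n| <= terms of exp(pi i n^2 z)."""
    if z.imag <= 0:
        raise ValueError("the series only converges for Im(z) > 0")
    total = 0j
    for n in range(-terms, terms + 1):
        # exp(pi i n^2 z) written as modulus * phase
        modulus = math.exp(-math.pi * n * n * z.imag)
        phase = math.pi * n * n * z.real
        total += cmath.rect(modulus, phase)
    return total


def functional_equation_gap(z: complex) -> float:
    """|theta(-1/z) - (z/i)^(1/2) theta(z)| with the principal square root."""
    root = cmath.sqrt(z / 1j)
    return abs(theta(-1 / z) - root * theta(z))


for z in (0.3 + 0.8j, -0.7 + 0.2j, 2.0j):
    print(z, functional_equation_gap(z))
```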


These formulae are some of the “strange identities” of Chap. IV. On
replacing z by −1/z we may rewrite (7) in the form


(28.10) $$\sum \exp(\pi i n^2 z + 2\pi i n t) = (z/i)^{-1/2} \sum \exp[-\pi i (t+n)^2/z].$$


Now, putting $q = \exp(\pi i z)$, whence $|q| = \exp(-\pi y) < 1$, and⁴⁶ x = e(t), the
left hand side is just the series


$$\sum q^{n^2} x^n$$


for which we wrote, in Chap. IV, eqn. (20.14), the curious expansion as an
infinite product. With this notation, similarly


(28.11) $$\theta(z) = \sum q^{n^2} = 1 + 2q + 2q^4 + 2q^9 + \ldots \qquad (q = e^{\pi i z}).$$


The series (8) is already, more or less, to be found in Fourier’s Théorie 
analytique de la chaleur, though he set little importance on it. The relation 
(9) was published by Poisson in 1823. Jacobi and Abel studied series of the 
type (7) systematically, from 1825 on, by purely algebraic methods; Abel died 
too early to exploit them, but Jacobi drew such a mass of formulae and of 
results, mainly in theory of the elliptic functions, that his name has remained 
attached to them. 

The connection to the theory of heat propagation is immediate. In an
annulus the evolution of the temperature is controlled by the partial differential
equation $f'_t = f''_{xx}$, with a numerical coefficient > 0 which depends on
the physical constants; t is the time and x the polar angle. Fourier’s idea was
to seek solutions of the form f(x,t) = g(x)h(t), whence $g(x)h'(t) = g''(x)h(t)$
and consequently $h'(t) = \lambda h(t)$, $g''(x) = \lambda g(x)$ where $\lambda$ is a constant. But g
must be of period 2π, whence g(x) = a cos nx + b sin nx and $\lambda = -n^2$, so that
$h(t) = c\,\exp(-n^2 t)$, where a, b and c are constants (Fourier eliminated the
functions $\exp(+n^2 t)$ for obvious physical reasons). Fourier then postulated
that the general solution of his equation is a sum


(28.12) $$f(x,t) = \sum \exp(-n^2 t)\,(a_n \cos nx + b_n \sin nx)$$

46 This x is not the real part of z; the notation here has been chosen to fit with 
that of Chap. IV, n° 20. 
of “decomposable” functions of this type, a method applicable to all sorts
of other problems, mainly in classical or quantum physics. One may then
calculate f(x,t) if one can expand the initial state f(x,0) of the system in a
series of the form


$$f(x,0) = \sum a_n \cos nx + b_n \sin nx;$$


it was this problem which led Fourier to expand every periodic function as a
trigonometric series.
The function


(28.13) $$\theta(x,t) = \sum \exp(-\pi n^2 t + 2\pi i n x) = 1 + 2\sum_{n \geq 1} \exp(-\pi n^2 t)\cos 2\pi n x$$


satisfies the equation

$$\theta''_{xx} = 4\pi\,\theta'_t$$


and so enters into the framework studied by Fourier; in fact, we now know
that it dominates the problem, for if one writes (12) in the form


$$f(x,t) = \sum c_n \exp(-\pi n^2 t)\,e_n(x),$$


summing over Z, an immediate⁴⁷ calculation shows that


(28.14) $$f(x,t) = \int \theta(x-y,t)f(y)\,dy \quad\text{for } t > 0,$$


where f(y) = f(y, 0) is the temperature distribution at the initial instant.
For the Jacobi function the initial data


$$\theta(x,0) = \sum \exp(2\pi i n x) = \sum e_n(x)$$


is not a true function; it is the Dirac measure at x = 0. Physically, the initial
temperature is +∞ at x = 0 and 0 elsewhere. This might not have made
Dirac recoil, but Fourier did not go so far as to envisage this version of the
Big Bang corresponding to what would happen if one set fire to an artillery
piece whose barrel, curved, was a perfect torus.
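To see the heat equation $\theta''_{xx} = 4\pi\,\theta'_t$ at work, the sketch below evaluates the cosine form of the series (28.13) and compares central finite differences of both sides. The step size h, the truncation order and the function names are choices made here for illustration only.

```python
import math


def theta(x: float, t: float, terms: int = 40) -> float:
    """1 + 2 * sum_{n >= 1} exp(-pi n^2 t) cos(2 pi n x), truncated; needs t > 0."""
    return 1.0 + 2.0 * sum(
        math.exp(-math.pi * n * n * t) * math.cos(2 * math.pi * n * x)
        for n in range(1, terms + 1)
    )


def heat_residual(x: float, t: float, h: float = 1e-4) -> float:
    """Finite-difference estimate of theta_xx - 4 pi theta_t at (x, t)."""
    d2x = (theta(x + h, t) - 2.0 * theta(x, t) + theta(x - h, t)) / h ** 2
    dt = (theta(x, t + h) - theta(x, t - h)) / (2.0 * h)
    return d2x - 4.0 * math.pi * dt


print(heat_residual(0.3, 0.5), heat_residual(0.1, 1.0))
```

The residual is not exactly zero because of the discretisation, but it is several orders of magnitude smaller than either derivative.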


29 — Fundamental formulae for the Fourier transform 


The Poisson summation formula allows us to pass very rapidly from the the- 
ory of Fourier series to that of Fourier integrals. The proofs which follow 
are taken from N. Vilenkin, Special functions and representations of groups 


47 Use the general formula to calculate the Fourier coefficients of a convolution 
product. 
(Moscow, 1965), and attributed to I. M. Gel’fand, 1960, the Soviet mathematician
who has invented many more original ideas since the 1930s than
one can give him credit for. Andrei Sakharov relates in his Mémoires that, during
the 1950s, Gel’fand headed a team of mathematicians at the university of
Moscow responsible for the calculations needed for the Soviet thermonuclear
programme. Long forbidden to travel outside the national territory, he is now
professor at Rutgers University, New Jersey, and travels often ... Many
other proofs of Fourier and Cauchy are now known, but there is little likelihood
that this will ever be improved because of the total absence of explicit
calculations⁴⁸; this is the great difference from all the classical proofs.
One starts from a function f satisfying the following hypotheses:


(H 1) f is continuous,
(H 2) the series $\sum f(x+n)$ converges normally on every compact set;


it follows, as we have seen, that f is bounded and absolutely integrable on R.
Let us put


(29.1) $$f_y(x) = f(x)e(yx) = f(x)e_y(x).$$

The exponential factors being of modulus 1, the series $\sum f_y(x+n)$ converges
normally on every compact set for every y; on the other hand, the Fourier
transform of the function $f_y$ is


(29.2) $$\hat f_y(t) = \int f(x)e(yx)e(-tx)\,dx = \hat f(t-y).$$


If we now assume that

(29.3) $$\sum |\hat f(n-y)| < +\infty$$


the Poisson summation formula applies to $f_y$ and shows that

(29.4) $$\sum f(x+n)e_y(x+n) = \sum \hat f(n-y)e_n(x).$$


Now $\sum f(x+n)e_y(x+n) = e_y(x)\sum f(x+n)e_y(n)$; since, generally, $e_x(y) = e_y(x)$,
(4) then leads to


(29.5) $$\sum f(x+n)e_n(y) = \sum \hat f(n-y)e_x(n-y).$$


For x given, let us write $F_x(y)$ for the common value of the two sides. By
(H 2), the left hand side of (5) is an absolutely convergent Fourier series in y. 


48 In fact, this is part of the theory of topological groups: one has a locally compact 
commutative group G = R, a discrete subgroup I’ = Z such that the quotient 
group G/I = K = T is compact, and is concerned to pass from harmonic analysis 
on K (Fourier series) to harmonic analysis on G (Fourier integrals). This is what 
André Weil did in 1940, in the general case, in a book we have already cited and 
which may have inspired Gel’fand, who, at the same time, invented the subject 
in Moscow with D. A. Raikov, using functional analytic methods not imposing 
any hypothesis on the structure of G. 
The coefficients f(x + n) are therefore found by integration (Chap. V, n° 5).
Whence, for n = 0,


$$f(x) = \int_0^1 F_x(y)\,dy = \int_0^1 dy \sum \hat f(n-y)e_x(n-y).$$

Let us now strengthen the hypothesis (3) and suppose


(H 3) the series $\sum \hat f(y+n)$ converges normally on every compact set;


then so does the series to be integrated, whence

$$f(x) = \sum \int_0^1 \hat f(n-y)e_x(n-y)\,dy = \sum \int_0^1 \hat f(n+y)e_x(n+y)\,dy,$$


which is simply the Fourier inversion formula


(29.6) $$f(x) = \int \hat f(y)e_x(y)\,dy = \int \hat f(y)e^{2\pi i xy}\,dy$$

where one integrates over R. We may write


(29.6’) $$\hat{\hat f}(x) = f(-x).$$


Let us now apply the Parseval-Bessel equality to (5), considered as a
Fourier series in y. We find the relation


(29.7) $$\sum |f(x+n)|^2 = \int_0^1 dy\,\Big|\sum \hat f(n-y)e_x(n-y)\Big|^2 = \int_0^1 dy\,\Big|\sum \hat f(n-y)e_n(x)\Big|^2$$

since $e^{-2\pi i xy}$, of modulus 1, is a factor of the series on the right hand side; the
series on the left hand side is convergent by Parseval-Bessel, but since f is
bounded,


(29.8) $$|f(x+n)|^2 \leq \|f\|\,|f(x+n)|;$$


so the series (7) converges normally on every compact set, by (H 2), and its
sum is continuous.

Now let us integrate with respect to x over (0,1); we find $\int |f(x)|^2\,dx$,
a convergent integral by (8) and the fact that $\int |f(x)|\,dx$ converges. On the
right hand side the function $\sum \hat f(n-y)e_n(x)$ is continuous in (y,x), for if a
convergent series $\sum v(n)$ dominates the series $\sum |\hat f(n-y)|$ on I = [0, 1], then
it dominates the series in question on I × R. We may therefore interchange
the order of integration (Chap. V, n° 9, Theorem 10), to obtain the double
integral

$$\int_0^1 dy \int_0^1 dx\,\Big|\sum \hat f(n-y)e_n(x)\Big|^2.$$


But the function to be integrated is, for y given, the square of an absolutely
convergent Fourier series in x. Its integral may therefore be calculated using
Parseval-Bessel (Chap. V, n° 5 suffices), i.e. is equal to $\sum |\hat f(n-y)|^2$. On
integrating with respect to y and comparing with the preceding result we
finally obtain the Plancherel formula


(29.9) $$\int |f(x)|^2\,dx = \int |\hat f(y)|^2\,dy;$$


this is the analogue of Parseval-Bessel for Fourier integrals. In conclusion:


Theorem 25. Let f be a continuous and absolutely integrable function on R
such that the series $\sum f(x+n)$ and $\sum \hat f(y+n)$ converge normally on every
compact set. Then


$$f(x) = \int \hat f(y)e(xy)\,dy, \qquad \int |f(x)|^2\,dx = \int |\hat f(y)|^2\,dy,$$


the three integrals over R being absolutely convergent. 
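As an illustration of the Plancherel formula, one may take the function $f(x) = e^{-t|x|}$ of Example 2 of n° 27, whose transform is $2t/(t^2 + 4\pi^2 y^2)$; both integrals should then equal 1/t. The sketch below approximates them with a midpoint rule on a truncated interval. The cutoff, the number of steps and the helper names are assumptions of this sketch, not part of the text.

```python
import math


def midpoint_integral(func, a: float, b: float, steps: int) -> float:
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))


def f(x: float, t: float) -> float:
    return math.exp(-t * abs(x))


def f_hat(y: float, t: float) -> float:
    # Fourier transform of exp(-t|x|), formula (27.13)
    return 2.0 * t / (t * t + 4.0 * math.pi ** 2 * y * y)


def plancherel_sides(t: float, cutoff: float = 50.0, steps: int = 200_000):
    """Return (integral of |f|^2, integral of |f_hat|^2) over [-cutoff, cutoff]."""
    left = midpoint_integral(lambda x: f(x, t) ** 2, -cutoff, cutoff, steps)
    right = midpoint_integral(lambda y: f_hat(y, t) ** 2, -cutoff, cutoff, steps)
    return left, right


print(plancherel_sides(1.0))
```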


More generally, if two functions f and g satisfy the hypotheses of the
theorem, then


(29.10) $$\int f(x)\overline{g(x)}\,dx = \int \hat f(y)\overline{\hat g(y)}\,dy;$$


one passes from the case f = g to the general case as we did for Fourier
series, i.e. by applying the Plancherel formula to the functions f + g, f − g,
f + ig and f − ig.


Example. In view of Example 2 of n° 27 we have


$$\int \frac{2t\,e^{2\pi i xy}}{t^2 + 4\pi^2 y^2}\,dy = e^{-t|x|} \quad\text{for every } t > 0.$$


This formula essentially says no more than


$$\int \frac{e^{ixy}}{y^2 + 1}\,dy = \pi e^{-|x|}.$$


Cauchy’s residue calculus would give this formula directly. To try to establish 
it “without knowing anything” is hopeless, except of course for x = 0, the 
only case where the primitive can be calculated. 
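Although no primitive is available for x ≠ 0, the formula $\int e^{ixy}/(y^2+1)\,dy = \pi e^{-|x|}$ can at least be tested numerically. The sketch below uses a midpoint rule on a truncated interval; the cutoff, the step count and the function name are assumptions made here, and the truncation limits the accuracy to roughly three or four digits.

```python
import math


def truncated_integral(x: float, cutoff: float = 200.0, steps: int = 200_000) -> float:
    """Midpoint rule for the integral of cos(x y)/(1 + y^2) over [-cutoff, cutoff].

    The imaginary part sin(x y)/(1 + y^2) is odd in y and contributes nothing,
    and the real part is even, so only [0, cutoff] is sampled and doubled.
    """
    h = cutoff / steps
    half = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        half += math.cos(x * y) / (1.0 + y * y)
    return 2.0 * h * half


for x in (0.5, 1.0, 2.0):
    print(x, truncated_integral(x), math.pi * math.exp(-abs(x)))
```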


Exercise (another proof of the inversion formula). Let f be a continuous
function on R satisfying

$$f(x) = O(1/|x|^a), \qquad \hat f(y) = O(1/|y|^b)$$


at infinity, with constants a, b > 1, so that f and $\hat f$ are absolutely integrable.
(i) Show that the Poisson summation formula applies to f [use Theorem 2
of n° 6]. (ii) Show that, for every T > 0,


$$\sum f(x+nT) = \frac{1}{T}\sum \hat f(n/T)\,e^{2\pi i nx/T}.$$


(iii) Show that, when T → +∞, the left hand side tends to f(x) and the
other to $\int \hat f(y)e(xy)\,dy$.


30 — Extensions of the inversion formula 


One may also write (29.10) in the often convenient form


(30.1) $$\int f(x)\hat g(x)\,dx = \int \hat f(y)g(y)\,dy;$$


it suffices, in (29.10), to replace g(x) by $\overline{\hat g(x)}$ and to note that then $\overline{\hat g(y)}$ is
replaced by g(y). The relation (1) is in fact directly obvious if one calculates
formally:


(30.2) $$\int f(x)\hat g(x)\,dx = \int f(x)\,dx \int g(y)\overline{e(xy)}\,dy = \iint f(x)g(y)\overline{e(xy)}\,dx\,dy = \int g(y)\,dy \int f(x)\overline{e(xy)}\,dx = \int \hat f(y)g(y)\,dy.$$


But one has to justify the interchange of the integrations. In the Lebesgue
theory it is enough to assume that $h(x,y) = f(x)g(y)\overline{e(xy)}$ is integrable on
R², i.e. that f and g are integrable, since e(xy) is continuous and bounded;
one then applies Fubini’s theorem (the real one ...) to the function obtained.

In the Riemann theory, there is a more restricted, but nevertheless useful 
result: 


Lemma. Let f and g be two absolutely integrable regulated functions; then


$$\int f(x)\hat g(x)\,dx = \int \hat f(y)g(y)\,dy.$$


First assume f and g are zero outside compact intervals K and H,
and consider on K and H the measures $d\mu(x) = f(x)dx$, $d\nu(y) = g(y)dy$
(Chap. V, n° 30, Example 1). Since the function $\overline{e(xy)}$ is continuous on
K × H we have


$$\int d\mu(x)\int \overline{e(xy)}\,d\nu(y) = \int d\nu(y)\int \overline{e(xy)}\,d\mu(x)$$

(Chap. V, n° 30, Theorem 30). By definition of $\mu$ and $\nu$, this relation justifies
the formal calculation (2). The lemma is therefore true under the hypotheses
just formulated.

In the general case, let us write $f_n$ and $g_n$ for the functions equal to f
and g on [−n, n] and zero elsewhere, whence


(30.3) $$\int f_n(x)\hat g_n(x)\,dx = \int \hat f_n(y)g_n(y)\,dy.$$


It all reduces to showing that one may pass to the limit under the integration
sign. Let us do this for the left hand side. It is enough to show that
$\|f_n\hat g_n - f\hat g\|_1$ tends to 0, since, generally, $|\int f| \leq \int |f|$. Now, omitting the
variable x,


$$|f_n\hat g_n - f\hat g| \leq |f_n - f|\cdot|\hat g_n| + |f|\cdot|\hat g_n - \hat g|$$
$$\leq |f_n - f|\cdot\|\hat g_n\| + |f|\cdot\|\hat g_n - \hat g\|$$
$$\leq |f_n - f|\cdot\|g_n\|_1 + |f|\cdot\|g_n - g\|_1$$


since

$$\|\hat g\| = \sup_x \Big|\int \overline{e(xy)}\,g(y)\,dy\Big| \leq \|g\|_1$$


for every absolutely integrable function on R. Whence, integrating over R,


(30.4) $$\|f_n\hat g_n - f\hat g\|_1 \leq \|f_n - f\|_1\cdot\|g_n\|_1 + \|f\|_1\cdot\|g_n - g\|_1.$$

But

$$\|f_n - f\|_1 = \int_{|x|>n} |f(x)|\,dx$$


tends to 0 since f is absolutely integrable, likewise $\|g_n - g\|_1$; the factor $\|f\|_1$
is independent of n and the factor $\|g_n\|_1$ tends to $\|g\|_1$. The right hand side
of (4) therefore tends to 0, qed.


For example let us choose for g the function $e^{-t|x|}$ and for f an absolutely
integrable regulated function. By Example 2 of n° 27 we find


(30.5) $$\int \frac{2t}{t^2 + 4\pi^2 x^2}\,f(x)\,dx = \int e^{-t|y|}\hat f(y)\,dy \quad\text{for every } t > 0.$$


If we put

(30.6) $$u(x) = 2/(1 + 4\pi^2 x^2), \qquad u_n(x) = n\,u(nx),$$


the relation (5) can be written, for t = 1/n, in the form


(30.7) $$\int u_n(x)f(x)\,dx = \int e^{-|y|/n}\hat f(y)\,dy.$$
The function u is continuous (and even $C^\infty$), positive, and its total integral
is equal to 1, as an elementary calculation shows. Dirac’s lemma of Chap. V,
n° 27, the version of Example 1, then shows that, if f is continuous at the
origin and bounded on R, the left hand side of (7) tends to f(0). On the
right hand side the exponential converges to 1 uniformly on every compact set
while remaining ≤ 1; if the function $\hat f$ is absolutely integrable, the right hand
side of (7) then tends to the integral of the latter (dominated convergence),
whence, in the limit,


(30.8) $$f(0) = \int \hat f(y)\,dy,$$


i.e. Fourier’s inversion formula for x = 0.
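The relation $\int u_n(x)f(x)\,dx = \int e^{-|y|/n}\hat f(y)\,dy$ and its limit f(0) can be watched numerically. The sketch below takes the Gaussian $f(x) = \exp(-\pi x^2)$, which is its own Fourier transform, as an assumed test function; the helper names, the cutoff and the step count are likewise choices made here.

```python
import math


def midpoint(func, a: float, b: float, steps: int) -> float:
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))


def u_n(x: float, n: int) -> float:
    # u_n(x) = n u(n x) with u(x) = 2 / (1 + 4 pi^2 x^2), as in (30.6)
    return 2.0 * n / (1.0 + 4.0 * math.pi ** 2 * (n * x) ** 2)


def gaussian(x: float) -> float:
    # assumed test function: f(x) = exp(-pi x^2), equal to its own transform
    return math.exp(-math.pi * x * x)


def left_side(n: int, cutoff: float = 6.0, steps: int = 240_000) -> float:
    return midpoint(lambda x: u_n(x, n) * gaussian(x), -cutoff, cutoff, steps)


def right_side(n: int, cutoff: float = 6.0, steps: int = 240_000) -> float:
    return midpoint(lambda y: math.exp(-abs(y) / n) * gaussian(y), -cutoff, cutoff, steps)


for n in (1, 10, 50):
    print(n, left_side(n), right_side(n))
```

Both columns agree for each n and drift towards f(0) = 1 as n grows, which is the content of the limit argument above.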

In fact, it is not necessary to assume f bounded. In Chap. V, n° 27,
this hypothesis was used only to show that, for every δ > 0, the integral
$\int f(x)u_n(x)\,dx$ extended over the set |x| > δ tends to 0. Now it is clear that

$$|x| \geq \delta > 0 \Longrightarrow |u_n(x)| \leq 1/2\pi^2 n\delta^2,$$


so that the function $x \mapsto u_n(x)$ converges uniformly to 0 on |x| ≥ δ; since,
here, f is assumed absolutely integrable, we have


$$\lim \int_{|x|>\delta} f(x)u_n(x)\,dx = 0$$


even if f is not bounded.
To obtain the inversion formula at an arbitrary point a ∈ R one replaces
$x \mapsto f(x)$ by $x \mapsto f(x+a)$. The Fourier transform becomes


$$\int f(x+a)\overline{e(xy)}\,dx = \int f(x)\overline{e(xy - ay)}\,dx = \hat f(y)e(ay)$$


and by applying (8) to the new function one obviously obtains Fourier’s
inversion formula at the point a if f is continuous at this point. Consequently:


Theorem 26. Let f be a continuous absolutely integrable function on R.
Suppose that $\hat f$ is absolutely integrable. Then


(30.9) $$f(x) = \int \hat f(y)e(xy)\,dy \quad\text{for every } x \in R.$$


Note that the proof uses only the following facts: (i) formula (2), which we
established using a “poor man’s Fubini” without using Theorem 25, (ii) the
perfectly elementary calculation of the Fourier transform of $e^{-t|x|}$, whence
(5) directly, (iii) the fact, also totally elementary, that the functions
$x \mapsto 2t/(t^2 + 4\pi^2 x^2)$ form a Dirac sequence when t tends to 0.

The reader will doubtless observe that the functions of Theorem 25 sat- 
isfy the hypotheses of Theorem 26. So why state a useless Theorem 25 when 
Theorem 26 provides us the inversion formula under more general hypotheses 
and without passing through Theorem 25? The reason is simple: besides that
Theorem 25 also gives us the Plancherel formula, its proof does not use, as 
we have said, any explicit calculation. 


When f is not continuous at 0 the relation $u_n(x) = u_n(-x)$ allows one
to argue as we did à propos Dirichlet’s theorem on Fourier series: the limit is
$\frac{1}{2}[f(0+) + f(0-)]$. The formula we obtained resembles Dirichlet’s, so we may
conjecture that it is valid under more general hypotheses than integrability
of $\hat f$. This is an interesting exercise, though its usefulness to us is very small.

If one is inspired by the case of Fourier series, one replaces, in the inversion
formula for x = 0, the “total” integral, not absolutely convergent,
substituting for it its “partial” integrals


(30.10) $$s_N(0) = \int_{-N}^{N} \hat f(y)\,dy = \int_{-N}^{N} dy \int f(t)\overline{e(yt)}\,dt$$


and one interchanges the integration signs; the lemma established above authorises
us to do this if f is regulated and absolutely integrable: take for g(y)
the characteristic function of the interval (−N, N). Then


(30.11) $$s_N(0) = \int f(t)\,\frac{e(Nt) - e(-Nt)}{2\pi i t}\,dt = \int f(t)K_N(t)\,dt.$$


The function $K_N(t) = \sin(2\pi Nt)/\pi t$ is not absolutely integrable, but its
integral over R is convergent since 1/t is monotone and tends to 0 at infinity
(Chap. V, n° 24, Theorem 23; there is no problem at t = 0 since the function
is continuous there). Let us put


(30.12) $$\int_0^{+\infty} K_N(t)\,dt = c;$$


it will emerge that c = 1/2, but we do not know this a priori. Since the function
$K_N$ is even, we find, as in the case of Fourier series, that


$$s_N(0) - c[f(0+) + f(0-)] = \int_0^{+\infty} \frac{f(t) - f(0+)}{\pi t}\sin(2\pi Nt)\,dt + \int_{-\infty}^{0} \frac{f(t) - f(0-)}{\pi t}\sin(2\pi Nt)\,dt.$$


Assume now that the function f has right and left derivatives at t = 0 and
put

(30.13) $$g(t) = \begin{cases} [f(t) - f(0+)]/\pi t & \text{for } t > 0,\\ ? & \text{for } t = 0,\\ [f(t) - f(0-)]/\pi t & \text{for } t < 0, \end{cases}$$


the sign ? indicating that the value attributed to g at 0 is unimportant. We
obtain a regulated function in R and

(30.14) $$s_N(0) - c[f(0+) + f(0-)] = \int g(t)\sin(2\pi Nt)\,dt = [\hat g(-N) - \hat g(N)]/2i.$$


It all amounts to showing that $\hat g(y)$ tends to 0 as |y| increases indefinitely.

This would be obvious if g were absolutely integrable (n° 27, Theorem 23),
but we are not in this case. Theorem 23 of Chap. V, n° 24 relative to integrals
of the form $\int f(x)\sin(xy)\,dx$, where f is monotone and tends to 0 at infinity,
will resolve the problem.

We may decompose the integral in (14) into three parts relative to the
intervals (−∞, −1), (−1, 1) and (1, +∞). The integral extended over (−1, 1)
is the Fourier transform of a regulated function of compact support, so tends
to 0 at infinity (Theorem 23). The integral extended over (1, +∞) can be
written


$$\int_1^{+\infty} \frac{f(t)}{\pi t}\sin(2\pi Nt)\,dt - f(0+)\int_1^{+\infty} \sin(2\pi Nt)\,dt/\pi t;$$


this calculation is legitimate because f(t) and a fortiori f(t)/t are absolutely
integrable, while the second integral converges; in fact, Theorem 23
of Chap. V, n° 24 even shows that it tends to 0. Likewise for the first, as the
Fourier transform of an absolutely integrable function. One argues similarly
for the interval (−∞, −1).

The integral in (14) thus tends to 0 and one obtains the following result: 


Theorem 27. Let f be an absolutely integrable regulated function. Then


(30.15) $$\lim_{N \to +\infty} \int_{-N}^{N} \hat f(y)e(xy)\,dy = \tfrac{1}{2}[f(x+) + f(x-)]$$


at every point where f has right and left derivatives.


And why has the unknown constant c transformed itself surreptitiously
into 1/2? Because, if one applies the formula to a sufficiently accommodating
function, one already knows (Theorem 25) that the right hand side of (15) is
equal to f(x). So the constant c has no choice ...

This small auxiliary result can be rewritten as


$$\int_{-\infty}^{+\infty} \sin(2\pi Nt)\,dt/t = \pi$$


or, by an obvious change of variable,

(30.16) $$\int_{-\infty}^{+\infty} \sin(t)\,dt/t = \pi,$$


a famous formula of Dirichlet’s. You are advised not to try to establish this
by looking for a primitive of sin(t)/t.
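Dirichlet’s formula concerns a convergent but not absolutely convergent integral, so a numerical check has to work with partial integrals over [−T, T], which approach π only at the rate 1/T. The sketch below (function name, step size and cutoffs are choices made here) shows this slow convergence.

```python
import math


def sine_integral_over_line(cutoff: float, step: float = 0.01) -> float:
    """Midpoint-rule value of the integral of sin(t)/t over [-cutoff, cutoff].

    The integrand is even, so [0, cutoff] is sampled and doubled; midpoints
    never hit t = 0, where sin(t)/t is continued by the value 1.
    """
    steps = int(round(cutoff / step))
    h = cutoff / steps
    half = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        half += math.sin(t) / t
    return 2.0 * h * half


for cutoff in (10.0, 100.0, 1000.0):
    print(cutoff, sine_integral_over_line(cutoff), math.pi)
```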
31 — The Fourier transform and differentiation


As we have seen in Chap. V, n° 25, Example 1, if a regulated function f on
R satisfies⁴⁹


(31.1) $$\int |f(x)|\,dx < +\infty, \qquad \int |xf(x)|\,dx < +\infty,$$


its Fourier transform is differentiable and

(31.2) $$\hat f'(y) = -2\pi i \int xf(x)\overline{e(xy)}\,dx,$$


the Fourier transform of $-2\pi i x f(x)$.

To formulate this result in a concentrated way, it is helpful to introduce
“operators” transforming the (or certain) functions on R into other functions
on R:

(i) the operator $M: f \mapsto Mf$ of multiplication by $-2\pi i x$;
(ii) the differentiation operator $D: f \mapsto Df$;
(iii) the Fourier transform operator $F: f \mapsto Ff = \hat f$.


Then formula (2) can be written as

(31.2’) $$DFf = FMf \quad\text{or}\quad D \circ F = F \circ M,$$


the symbol $\circ$ as always denoting the composition of maps. One must remain
aware of the fact that (2’) assumes Mf absolutely integrable.

One may iterate the argument so long as the functions $M^k f$ are integrable.
Since the Fourier transform F exchanges M and D, one clearly finds the
formula


(31.2”) $$D^k F f = F M^k f$$

if $M^k f(x) = (-2\pi i x)^k f(x)$ is absolutely integrable. Whence a first result:


Lemma 1. Let f be a regulated function such that $\int |x^p f(x)|\,dx < +\infty$.
Then $\hat f$ is of class $C^p$ and


(31.3) $$D^k \hat f(y) = \int (-2\pi i x)^k f(x)\overline{e(xy)}\,dx \quad\text{for every } k \leq p.$$


Limit case:

$$\hat f \text{ is } C^\infty \text{ if } \int |x^p f(x)|\,dx < +\infty \text{ for every } p.$$

49 The second condition implies the first since f is integrable on every compact set 
and |f(x)| < |x f(x)| for |x| large. 
Now suppose that f is of class $C^1$ (or, more generally, a primitive of a
regulated function) and that $\int |f'(x)|\,dx < \infty$. Integrating by parts, one has,
for y ≠ 0,


(31.4) $$\int_{-T}^{T} f(x)e^{-2\pi i xy}\,dx = f(x)\,\frac{e^{-2\pi i xy}}{-2\pi i y}\,\Big|_{-T}^{T} + \frac{1}{2\pi i y}\int_{-T}^{T} f'(x)e^{-2\pi i xy}\,dx.$$


Since f′ is integrable on R the function


$$f(x) = f(0) + \int_0^x f'(t)\,dt$$


tends to a limit as x tends to +∞ or −∞. This limit is zero since otherwise
the integral $\int |f(t)|\,dt$ would be clearly divergent. So we see that in (4), the
integrated part tends to 0 when T → +∞, and there remains


(31.5) $$\widehat{f'}(y) = 2\pi i y\,\hat f(y),$$


which we may write in the form

(31.5’) $$FDf = -MFf \quad\text{or}\quad F \circ D = -M \circ F.$$


If f is of class $C^p$, and if all its derivatives are absolutely integrable, we may
apply the calculation p times to obtain


(31.6) $$\widehat{f^{(p)}}(y) = (2\pi i y)^p \hat f(y), \quad\text{i.e.}\quad FD^p f = (-1)^p M^p F f.$$

Now the left hand side tends to 0 at infinity (Theorem 23); consequently:


Lemma 2. If f is of class $C^p$ and if all its derivatives are absolutely integrable,
then


(31.7) $$\hat f(y) = o(y^{-p}) \quad\text{when } |y| \to +\infty.$$


In other words: the Fourier transform decreases at least as rapidly as the
function f has integrable derivatives.


The ideal case is that where f is indefinitely differentiable, with derivatives
satisfying


(31.8) $$f^{(p)}(x) = O(x^{-q}) \quad\text{for any p and q;}$$


one then says (L. Schwartz) that f is indefinitely differentiable with rapid
decrease; the set of these functions is denoted S(R) or simply S. We must
not forget that the condition of decrease at infinity applies not only to f, but
to all its derivatives. If f is in S, so also is the function $x^p f^{(q)}(x)$ for any
p and q, for on multiplying a derivative of arbitrary order of $x^p f^{(q)}(x)$ by a
power of x we obtain a linear combination of a finite number of functions of
the form $x^k f^{(h)}(x)$, which are $O(x^{-N})$ for any N by (8) for p = h, q = k + N.
Theorem 28. The Fourier transform maps S bijectively onto S.


It is clear that if f ∈ S one may apply Lemma 1 to it for any p, so
that $\hat f$ is $C^\infty$. Since $x^p f(x)$ is also in S one may apply Lemma 2 for any p;
consequently, $\hat f(y) = O(y^{-N})$ for any N. But the derivatives of $\hat f$ are, up to
constant factors, the Fourier transforms of the functions $x^p f(x)$, which are
again in S. They too are $O(y^{-N})$ at infinity for any N.


Consequently, f ∈ S implies $\hat f$ ∈ S. But since $\hat{\hat f}(x) = f(-x)$, the condition
$\hat f$ ∈ S implies conversely that f ∈ S. The map is therefore bijective, qed.

An immediate corollary is that the Poisson summation formula, Fourier’s 
inversion formula and Plancherel’s formula apply to every f € S. 

Another important, and easy to establish, result in S is the formula


(31.9) $$\widehat{f * g} = \hat f\,\hat g$$


which gives the Fourier transform of a convolution product

(31.10) $$f * g(x) = g * f(x) = \int f(x-y)g(y)\,dy,$$


an analogue to the formula (4.10) for periodic functions. Calculating formally:


$$\hat f(t)\hat g(t) = \int f(x)\overline{e_t(x)}\,dx \int g(y)\overline{e_t(y)}\,dy = \iint \overline{e_t(x+y)}f(x)g(y)\,dx\,dy =$$
$$= \int g(y)\,dy \int \overline{e_t(x+y)}f(x)\,dx = \int g(y)\,dy \int \overline{e_t(x)}f(x-y)\,dx =$$
$$= \int \overline{e_t(x)}\,dx \int g(y)f(x-y)\,dy = \int f * g(x)\overline{e_t(x)}\,dx,$$


whence the result. The interchange of the repeated integrals is justified by
Theorem 25 of Chap. V, n° 26 since the exponential is of modulus 1 and the
functions f and g are absolutely integrable and bounded in R, so that the
function $e_t(x+y)f(x)g(y)$ is, up to constants, dominated either by |f(x)|,
or by |g(y)|.

The relation (9) is in fact valid under much wider hypotheses (it would
be enough for f and g to be regulated and absolutely integrable, the case where
f and g are continuous of compact support being particularly obvious),
but since Lebesgue’s integration theory yields it very easily in an even more
general case, it is better to wait to deal with this.

Since the ordinary product of two functions of S is again in S (obvious!),
(9) and Theorem 28 show that


$$(f \in S) \;\&\; (g \in S) \Longrightarrow f * g \in S.$$


Exercise: prove this directly, starting from (10).
Since we have already given three different proofs of the inversion formula
(Theorem 25, Exercise of n° 29 and Theorem 26), we may as well
give a fourth, based on the idea, dear to the physicists, that a “continuous 
spectrum” is the limit case of a “discrete spectrum” whose “lines” approach 
each other more and more closely, as Cavalieri, with his “indivisibles”, would 
have had no trouble understanding. The method rests on a simple formal 
calculation, but one has to justify it, which is less easy. 

One starts with a regulated function f defined on R and, for every T > 0,
considers the function $f_T$ of period T satisfying


(31.11) $$f_T(x) = f(x) \quad\text{for } -T/2 < x < T/2.$$


“Clearly” we have a Fourier series expansion


(31.12) $$f_T(x) = \sum a_n(T)e_n(x/T)$$


with


(31.13) $$a_n(T) = \frac{1}{T}\int_{-T/2}^{T/2} f_T(x)e_n(-x/T)\,dx = \frac{1}{T}\int_{-T/2}^{T/2} f(x)e(-nx/T)\,dx.$$


For T large the last integral is “almost” equal to the integral extended over
all R, i.e. to $\hat f(n/T)$, whence “manifestly”, for x = 0 let us say, the formula


(31.14) $$f(0) \approx \frac{1}{T}\sum \hat f(n/T).$$


The right hand side is “obviously” the Riemann sum one would obtain in calculating
$\int \hat f(y)\,dy$ using the subdivision of R by the abscissae n/T. Whence,
in the limit, $f(0) = \int \hat f(y)\,dy$ and, by translation, the inversion formula at
an arbitrary point. The same calculation also yields the Plancherel formula.
The Parseval-Bessel theorem applied to the Fourier series of $f_T$ shows that


(31.15) $$\frac{1}{T}\int_{-T/2}^{T/2} |f_T(x)|^2\,dx = \sum |a_n(T)|^2 \approx \frac{1}{T^2}\sum |\hat f(n/T)|^2,$$


the sign ≈ signifying that the second term is “almost” equal to the third.
In the first term one can replace $f_T$ by f, whence an integral which tends
to $\int |f(x)|^2\,dx$; if one multiplies the third term by T to eliminate the factor
1/T from the first, one finds again a Riemann sum which “clearly” tends to
$\int |\hat f(y)|^2\,dy$, “qed”.

This is all very well, but there are several gaps to fill in, which explains 
why some textbooks for “users” confine themselves to the formal calculation 
and to the traditional mathematical variant of the argument from authority, 
accompanied by recourse to physical intuition. 

If one restricts oneself to examining what happens for $x = 0$, which is no
restriction in generality, the relation (12) assumes that $f_T$, i.e. $f$, is continuous
at this point since otherwise the inversion formula has little chance of being
correct. It also assumes, more seriously, that the Fourier series converges.
Simplest is to assume $f_T$ of class $C^1$ on R, so imposing the same hypothesis on
$f$, but even in this case definition (11) shows that $f_T$ has every chance of being
discontinuous at the points $\pm T/2$. A convenient procedure for eliminating the
difficulty is to assume $f$ of compact support since, for $T$ large enough, $f$ is
then zero on a neighbourhood of the end-points of the interval (11). If this is
the case, the second integral in (13) is in fact extended over all R for $T$ large,
whence $a_n(T) = \hat f(n/T)/T$ directly and the formula (14) follows.

One then has to pass from the series $\sum \hat f(n/T)/T$ to the integral of $\hat f$.
This assumes at least that the latter converges. Since $f$ is of compact support,
Lemma 2 above shows that this is the case if $f$ is $C^2$, since then $\hat f(y) =
O(1/y^2)$ at infinity. This done, one may consider the series $\sum \hat f(n/T)/T$ as
the integral over R of the function $\varphi_T$ equal to $\hat f(n/T)$ between $(n-1)/T$
and $n/T$; as $T$ increases $\varphi_T$ converges simply (and even uniformly on every
compact set) to $\hat f$ since $\hat f$ is continuous. Since we have the global estimate
$|\hat f(y)| \le M/(1+y^2) = p(y)$, the same estimate applies to $\varphi_T$, and since the
positive function $p$ is integrable on R, the dominated convergence theorem
shows that the integral of $\varphi_T$ tends to that of $\hat f$; one may therefore, in (14),
replace the series by $\int \hat f(y)\,dy$, whence the inversion formula. The calculation
leading to the Plancherel formula is justified by analogous arguments.
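
The passage from the series to the integral can be watched numerically. The sketch below is only an illustration under assumptions of our own: it uses the Gaussian $f(x) = \exp(-\pi x^2)$, which is not of compact support (so it lies outside the hypotheses just used, though it is in S), because its transform $\hat f(y) = \exp(-\pi y^2)$ is known in closed form. The function names and the truncation `n_max` are hypothetical choices, not part of the text.

```python
import math

def gaussian_hat(y):
    # Transform of f(x) = exp(-pi x^2) for the convention
    # f^(y) = integral of f(x) exp(-2 pi i x y) dx; it is exp(-pi y^2).
    return math.exp(-math.pi * y * y)

def riemann_sum_of_transform(T, n_max=2000):
    # (1/T) * sum of f^(n/T) over |n| <= n_max: right-hand side of (31.14).
    return sum(gaussian_hat(n / T) for n in range(-n_max, n_max + 1)) / T

def riemann_sum_of_squared_transform(T, n_max=2000):
    # (1/T) * sum of |f^(n/T)|^2: T times the third term of (31.15).
    return sum(gaussian_hat(n / T) ** 2 for n in range(-n_max, n_max + 1)) / T

if __name__ == "__main__":
    # Expected limits: f(0) = 1 and the integral of |f|^2, i.e. 1/sqrt(2).
    for T in (1.0, 2.0, 4.0):
        print(T, riemann_sum_of_transform(T), riemann_sum_of_squared_transform(T))
```

For this particular $f$ the sums approach $f(0) = 1$ and $\int |f(x)|^2\,dx = 1/\sqrt{2}$ very quickly as $T$ grows; the speed is a feature of the Gaussian and should not be expected of a general $f$.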

The necessary arguments become noticeably more difficult if one abandons
the hypothesis that $f(x)$ is zero for $|x|$ large. Even under the much too
strong hypothesis that $f \in S$ the difficulty due to the fact that $f_T$ may have
isolated discontinuities does not disappear: the Fourier series of $f_T$ does not
converge absolutely and if one wants to pass to (14) one has to evaluate
precisely the difference between $a_n(T)$ and $\hat f(n/T)/T$, which is easy since $f \in S$,
and then pass to the limit term-by-term in the series (12).


32 — Tempered distributions 


When Schwartz invented his theory of distributions (Chap. V, n° 34) he
immediately asked the following question: can one define the Fourier transform
of a distribution $T$ on R as on the torus T? Now a distribution is a linear form on
the space D = D(R) of the $C^\infty$ functions of compact support, satisfying
certain conditions of continuity; since the exponentials are not of compact
support, the standard formula makes no sense unless $T$ is a bounded Radon
measure $\mu$ (Chap. V, n° 31, Example 1) on R; one may, in this case, integrate
every bounded continuous function with respect to $\mu$ (same method as for the
measure $dx$: Chap. V, n° 22) and then define


$\hat\mu(y) = \int \overline{\mathbf{e}(xy)}\,d\mu(x).$


In the general case, the problem would have an immediate answer if we
knew that $f \mapsto \hat f$ maps D bijectively onto D: we would then define $\hat T$ so as
to obtain the distribution $\hat f(y)\,dy$ if $dT(x) = f(x)\,dx$, i.e. by the formula


(32.1) $\int \varphi(y)\,d\hat T(y) = \int \hat\varphi(x)\,dT(x)$, i.e. $\hat T(\varphi) = T(\hat\varphi)$,


directly inspired by (30.1).
Alas, the Fourier transform of a function of compact support is never of
compact support. In this case


$\hat f(y) = \int f(x)\exp(-2\pi i xy)\,dx = \sum_{n \ge 0} \frac{(-2\pi i y)^n}{n!} \int x^n f(x)\,dx$


since we are integrating over a compact set $K$ a series that is normally
convergent on $K$. Here we may even assume $y$ complex, so that $\hat f$ is the restriction
to R of an analytic function on C, i.e. of an entire function. The principle of
analytic continuation then shows that, for $f$ regulated and of compact support,
$\hat f$ cannot be of compact support (or zero on a nonempty open interval)
unless $\hat f = 0$, which, for $f \in D$ (and even for $f$ continuous: Theorem 26),
implies $f = 0$. The situation is not what we met à propos Fourier series.
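
The expansion can be checked by hand on a simple case. The example below is our own choice, not the text's: $f$ is the indicator function of $[-1/2, 1/2]$, which is regulated and of compact support but not continuous. Its transform is computed directly and then recovered from the moments $\int x^n f(x)\,dx$.

```latex
\[
f(x) = \begin{cases} 1 & |x| \le 1/2 \\ 0 & |x| > 1/2 \end{cases}
\qquad\Longrightarrow\qquad
\hat f(y) = \int_{-1/2}^{1/2} e^{-2\pi i x y}\,dx = \frac{\sin \pi y}{\pi y}.
\]
% Moments of f: the odd ones vanish by symmetry.
\[
\int_{-1/2}^{1/2} x^n\,dx =
\begin{cases} \dfrac{1}{2^{n}(n+1)} & n \text{ even} \\[1ex] 0 & n \text{ odd} \end{cases}
\]
% Power series built from the moments, with n = 2k.
\[
\sum_{k \ge 0} \frac{(-2\pi i y)^{2k}}{(2k)!}\cdot\frac{1}{2^{2k}(2k+1)}
= \sum_{k \ge 0} \frac{(-1)^k (\pi y)^{2k}}{(2k+1)!}
= \frac{\sin \pi y}{\pi y}.
\]
```

The result is an entire function of $y \in \mathbf{C}$ whose only zeros are the nonzero integers, so it vanishes on no interval, in agreement with the argument above.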

To cut through this dilemma, Schwartz had to introduce a particular class
of distributions and, to do this, to substitute for D the space S endowed
with a suitable topology. If one wants to define the Fourier transform of a
distribution $T$ by formula (1) for $\varphi \in D$, one has to be able to define the
value of $T$ on the Fourier transforms of the $\varphi \in D$, i.e. on functions which
are in S but not in D; supposing this point achieved, one again has to verify
that the linear form $\varphi \mapsto T(\hat\varphi)$ so obtained is continuous. The solution is
then (i) to endow S with a topology making the map $f \mapsto \hat f$ of S into S
continuous, (ii) to restrict oneself to the distributions $T : D \to C$ which can
be extended to continuous linear forms $S \to C$.

Consider now the first problem. For every $f \in S$ the numbers


(32.2) $N_{p,q}(f) = \sup_{x \in R} |x^p f^{(q)}(x)|$

are finite by definition. Clearly


$N_{p,q}(f + g) \le N_{p,q}(f) + N_{p,q}(g)$


and $N_{p,q}(cf) = |c|\,N_{p,q}(f)$ for every constant $c$; furthermore, it is clear that
$N_{p,q}(f) = 0$ only if $f = 0$; each function $N_{p,q}$ is therefore a norm on the
vector space S. For every $r \in N$ the function


(32.3) $N_r(f) = \sum_{p+q \le r} N_{p,q}(f)$


(no connection with the norms $N_p$ of integration theory) has again the same
properties and $N_r \le N_{r+1}$. One now defines a topology on S by calling every
set defined by an inequality

$N_r(f - g) < \rho,$


where $\rho > 0$ and $r \in N$ are chosen arbitrarily, a “ball of centre $f$”, and
declaring that a subset $U$ of S is “open” if for every $f \in U$ the set $U$ contains
a ball of centre $f$ (see⁵⁰ the Appendix to Chap. III, n° 8). Convergence in S
can then be translated into the condition


(32.4) $\lim N_r(f - f_n) = 0$ for every $r$;

equivalently, one demands that, for any $p$ and $q$,

(32.4’) $\lim x^p [f_n^{(q)}(x) - f^{(q)}(x)] = 0$ uniformly on R.


This allows one to speak of continuous functions on S, for example of
continuous maps from S into S. If $U$ is such a map, denoted $f \mapsto U(f)$ or
$Uf$ according to the case and to the author, one has, to express the continuity
of $U$ at a “point” $f_0$ of S, to write that for every ball $B$ of centre $g_0 = Uf_0$
there exists a ball $B'$ of centre $f_0$ such that $U$ maps $B'$ into $B$; in other words
that, for any $r \in N$ and $\varepsilon > 0$, there exists an $r' \in N$ and an $\varepsilon' > 0$ such that


$N_{r'}(f - f_0) < \varepsilon' \Longrightarrow N_r(Uf - Uf_0) < \varepsilon.$


If $U$ is linear, the most frequent case, it clearly suffices to express
continuity for $f_0 = 0$. If $U$ takes its values in C, one replaces the inequalities
$N_r(Uf - Uf_0) < \varepsilon$ by the single condition $|Uf - Uf_0| < \varepsilon$.

Exercise — Show that $f \mapsto f^2$ is a continuous map of S into S.

With these definitions, one sees immediately that differentiation $D : f \mapsto
f'$ is a continuous map of S into S; indeed


$N_{p,q}(f') = \sup |x^p f^{(q+1)}(x)| = N_{p,q+1}(f),$


whence the inequality


(32.5) $N_r(f') \le N_{r+1}(f)$


which yields the result.

Similarly, the operator $M$, multiplication by the function $-2\pi i x$, maps
S linearly into S and is continuous. When one replaces $f(x)$ by $x f(x)$ the
function $f^{(q)}(x)$ is replaced by $x f^{(q)}(x) + q f^{(q-1)}(x)$, whence


$N_{p,q}(Mf) = 2\pi \sup |x^{p+1} f^{(q)}(x) + q x^p f^{(q-1)}(x)| \le
2\pi N_{p+1,q}(f) + 2\pi q N_{p,q-1}(f);$


50 One might also consider the sets $N_{p,q}(f - g) < \rho$ without changing the topology;
using the $N_r$ is technically a little easier. On the other hand, note that the family
of norms $N_r$ or $N_{p,q}$ is countable, so one could define the topology of S using a
single distance (Appendix to Chap. III, n° 8), so in fact S is a metric space, and
moreover complete (exercise!); but it is not a Banach space: the topology of S
cannot be defined by a single norm.


a finite result (whence $Mf \in S$) and implying the estimate

(32.6) $N_r(Mf) \le c_r N_{r+1}(f)$


with a constant $c_r$ whose exact value is of little importance, because (6) is
enough to establish the continuity of $M$.

As a map of S into S, the Fourier transform $F$ is also continuous in both
directions. To see this without much calculating, first remark that


$N_{p,q}(f) = N_0(M^p D^q f) = \|M^p D^q f\|$


(up to the constant factor $(2\pi)^p$ coming from $M$, which plays no role here)
and then $N_{p,q}(\hat f) = N_0(M^p D^q F f) = N_0(M^p F M^q f) = N_0(F D^p M^q f)$ by
the “commutation formulae” (31.2”) and (31.6). Now, in general,


$N_0(Ff) = \sup_y \left| \int f(x)\,\overline{\mathbf{e}(xy)}\,dx \right| \le \int |f(x)|\,dx = \|f\|_1;$


since the function $(x^2 + 1) f(x)$ is bounded by $N_2(f)$, by (3), one finds


$N_0(Ff) \le N_2(f) \int (x^2 + 1)^{-1}\,dx,$


with a convergent integral whose exact value, $c = \pi$, is not important. It
follows that

$N_{p,q}(\hat f) = N_0(F D^p M^q f) \le c\,N_2(D^p M^q f);$

on applying (5) $p$ times to the function $M^q f$ one finds a result $\le N_{p+2}(M^q f)$
up to a constant factor, and by applying (6) $q$ times to $f$ one obtains a relation
of the form $N_{p,q}(\hat f) \le N_{p+q+2}(f)$ up to a constant factor. Remembering the
definition (3) of $N_r$, we finally have


(32.7) $N_r(\hat f) \le c'_r N_{r+2}(f)$


where $c'_r$ is a new constant. This proves the continuity of the Fourier
transform. Since it is bijective and quasi identical to its inverse map by virtue
of the relation $\hat{\hat f}(x) = f(-x)$, we conclude that the Fourier transform is a
bijective and bicontinuous map of S onto S, in other words what in topology
one calls a homeomorphism (linear too) of S onto S.

With their systematic recourse to the operators D, M and F, these cal- 
culations can appear a little abstract. But to write explicitly the integrals 
and derivatives which they mask would be even less enticing. 


We can now return to distribution theory. Following Schwartz, we will
call any continuous linear form $T : S \to C$ a tempered distribution on R. The
inequality $|T(f)| < \varepsilon$ has to hold for every $f \in S$ “close enough” to 0; this
means that there exists an $r \in N$ and a $\delta > 0$ such that

$N_r(f) < \delta \Longrightarrow |T(f)| < \varepsilon.$


Continuity is expressed as follows: there exist an $r \in N$ and a constant
$M(T) \ge 0$ such that


(32.8) $|T(f)| \le M(T)\,N_r(f)$ for every $f \in S$;


the argument is the same as in the normed vector spaces of the Appendix to
Chap. III, n° 6.

To justify the terminology, we have to show how $T$ defines a distribution
in the sense of Chap. V, n° 34. Since S contains D it is clear that $T$ defines a
linear form on D, but again one has to prove continuity. If one works in the
subspace $D(K)$ of the $\varphi \in D$ vanishing outside a compact subset $K$ of R one
has


$N_{p,q}(\varphi) = \sup |x^p \varphi^{(q)}(x)| \le c(K)^p\,\|\varphi^{(q)}\|$


where $c(K)$ is the upper bound of $|x|$ on $K$. One deduces that


$N_r(\varphi) \le c_r(K)\,(\|\varphi\| + \ldots + \|\varphi^{(r)}\|) = c_r(K)\,\|\varphi\|^{(r)}$


in the notation of Chap. V, (34.3), with again another constant $c_r(K)$
depending only on $K$ and on $r$. The inequality (8) then shows that the
restriction of $T$ to the subspace $D(K)$ satisfies the continuity condition
$|T(\varphi)| \le M_K(T)\,\|\varphi\|^{(r)}$ demanded of a distribution in Chap. V, (34.6).

It is equally necessary to show that two tempered distributions cannot
define the same distribution on D unless they are identical⁵¹. By difference,
it is enough to show that if $T(\varphi) = 0$ for every $\varphi \in D$, then also $T(f) = 0$
for every $f \in S$. Since $T$ is a continuous linear form on S it is enough to
exhibit a sequence $f_n \in D$ which converges to $f$ in S, i.e. to show that D
is “everywhere dense” in S, like Q in R, like the trigonometric polynomials
in the space of continuous functions on T, like the usual polynomials in the
space of continuous functions on a compact interval, etc.

So let us start from a function $\varphi \in D$ equal to 1 for $|x| \le 1$, for example the
function employed in Chap. V, n° 29, to prove the existence of $C^\infty$ functions
having arbitrarily given derivatives at a point. Let us put $\varphi_n(x) = \varphi(x/n)$,
a function equal to 1 for $|x| \le n$. We shall see that, for every $f \in S$, the
$f_n(x) = \varphi_n(x) f(x)$, which are clearly in D, answer the need, in other words
that


(32.9) $\lim N_r(f - f\varphi_n) = 0$ for every $r \in N$.


This is equivalent to saying that all the functions $M^p D^q(f - f\varphi_n)$ converge
to 0 uniformly on R. Now, by Leibniz,


51 The reader may accept the result, which is not of serious importance in what 
follows. 
(32.10) $M^p D^q(f - f\varphi_n) = M^p[D^q f - (D^q f \cdot \varphi_n + \ldots + f\,D^q\varphi_n)]$

$\qquad = M^p(1 - \varphi_n)\,D^q f - M^p(\ldots)$

where the terms inside the sign $(\ldots)$ contain derivatives of $\varphi_n$, i.e. functions
of the form $n^{-k}\varphi^{(k)}(x/n)$ with $1 \le k \le q$. Such a function is everywhere
bounded in modulus by $n^{-k}\,\|D^k\varphi\|$, so that the sum of the terms considered
is, for every $x$, bounded in modulus by


$\sum_{1 \le k \le q} ?\; n^{-k}\,\|D^k\varphi\| \cdot |D^{q-k} f(x)|;$


the signs ? denote binomial coefficients of no importance. If one applies the
operator $M^p$ of multiplication by $(-2\pi i x)^p$ to these terms one obtains a
function bounded in modulus by


$\sum ?\; n^{-k}\,\|D^k\varphi\| \cdot |x^p D^{q-k} f(x)|$


with other coefficients ? independent of $f$ and of $n$. In passing to the sup for
$x \in R$ one finds a result less than


$\sum ?\; n^{-k}\,\|D^k\varphi\| \cdot N_r(f)$


where $r = p + q$. Since the sum is over the $k \in [1, q]$ and since $n^{-k} \le 1/n$, the
final result, up to a constant factor independent of $f$ and of $n$, is bounded by
$N_r(f)/n$. For $f$ given, it is therefore $O(1/n)$.

It remains to examine the term $M^p(1 - \varphi_n)\,D^q f$ in (10). Since $\varphi_n(x) = 1$
for $|x| \le n$, this term vanishes for $|x| \le n$. Ignoring the factors $-2\pi i$, its
uniform norm on R is then in fact equal to


$\sup_{|x| > n} |1 - \varphi_n(x)| \cdot |x^p D^q f(x)|.$


Now $|1 - \varphi_n(x)| \le 1 + \|\varphi\| = c$. Since $f \in S$ the function $|x^{p+1} D^q f(x)|$ tends
to 0 at infinity, so is bounded on R; we deduce estimates of the form


$|x^p D^q f(x)| \le c_{pq}/|x|$


valid for every $x \ne 0$. The sup for $|x| > n$ is thus, also, $O(1/n)$.
Combining these two results, we see that


$\|M^p D^q(f - f\varphi_n)\| = O(1/n)$


for any $p$ and $q$, which proves that $f = \lim f\varphi_n$ in the topology of S, qed.
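
The decay of the truncation error can be observed numerically. The sketch below rests on assumptions of our own: the cutoff `phi` is one standard construction of a $C^\infty$ function equal to 1 on $[-1, 1]$ and to 0 outside $[-2, 2]$ (not necessarily the one of Chap. V, n° 29), $f$ is the Gaussian $\exp(-\pi x^2)$, and only the term with $q = 0$ is evaluated, on a finite grid. All names are hypothetical.

```python
import math

def h(t):
    # C-infinity on R, identically zero for t <= 0.
    return math.exp(-1.0 / t) if t > 0 else 0.0

def smooth_step(t):
    # C-infinity, equal to 0 for t <= 0 and to 1 for t >= 1.
    return h(t) / (h(t) + h(1.0 - t))

def phi(x):
    # Cutoff of compact support: 1 for |x| <= 1, 0 for |x| >= 2.
    return 1.0 - smooth_step(abs(x) - 1.0)

def f(x):
    # A function of S chosen for the illustration.
    return math.exp(-math.pi * x * x)

def sup_tail(n, p, x_max=20.0, steps=4001):
    # Grid approximation of sup |x^p (1 - phi(x/n)) f(x)|, the q = 0 term.
    best = 0.0
    for k in range(steps):
        x = -x_max + 2.0 * x_max * k / (steps - 1)
        best = max(best, abs(x) ** p * (1.0 - phi(x / n)) * f(x))
    return best

if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, sup_tail(n, p=2))
```

For a Gaussian the error falls far faster than the $O(1/n)$ guaranteed by the proof; the bound $O(1/n)$ is what holds for every $f \in S$.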


This done, it is immediate to define the Fourier transform $\hat T = FT$ of a
tempered distribution $T$: one puts, in Leibniz’ notation,


(32.11) $\int f(y)\,d\hat T(y) = \int \hat f(x)\,dT(x)$

or, in that of the inventor,

(32.11’) $\hat T(f) = T(\hat f)$ for every $f \in S$.


Since the map $f \mapsto \hat f$ of S into S is continuous, and so likewise is $f \mapsto T(f)$,
so is $f \mapsto T(\hat f)$, whence one obtains a tempered distribution.

One may also, as in Chap. V, n° 35, define the derivative (again tempered)
of $T$ by


(32.12) $T'(f) = -T(f')$ for every $f \in S$,


and iterate the operation. To calculate the derivative $D\hat T$ of $\hat T$ one has to
write


$D\hat T(f) = -\hat T(Df) = -T(FDf)$


where $F$ is the Fourier transform in S; but (31.5’) shows that $FDf = -MFf$;
thus


(32.13) $D\hat T(f) = T(MFf).$
Putting $D\hat T = S$, this can be written


$\int f(x)\,dS(x) = \int (-2\pi i y)\,\hat f(y)\,dT(y) = -\int \hat f(y)\,2\pi i y\,dT(y).$


Thus we see the distribution “of density $-2\pi i y$ with respect to $dT(y)$” appear;
if $T$ were of the form $p(y)\,dy$ with a reasonable function $p$ we would thus
obtain the distribution $-2\pi i y\,p(y)\,dy$. So it is natural to write $MT$ for the
distribution $-2\pi i y\,dT(y)$, the ordinary product of $T$ by the function $-2\pi i y$;
it is again given by⁵²


(32.14) $MT(f) = T(Mf)$ for every $f \in S$.

This done, (13) can be written

(32.15) $DFT(f) = MT(Ff) = FMT(f)$


by definition of the Fourier transform $FMT$ of $MT$. In other words, the
formula $DF = FM$ remains valid for tempered distributions. One can show
similarly that $MFT = -FDT$: the Fourier transform exchanges the
operators of differentiation and of multiplication by $-2\pi i x$ in the context of
functions or of distributions.
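
The relation $DF = FM$ can be verified by hand on a simple tempered distribution. The example is our own: $\delta_a$ denotes the Dirac measure at a point $a \in \mathbf{R}$ (a notation assumed here), so that $\delta_a(f) = f(a)$.

```latex
% Fourier transform of the Dirac measure at a: a bounded exponential density.
\[
\hat\delta_a(f) = \delta_a(\hat f) = \hat f(a) = \int f(x)\,e^{-2\pi i a x}\,dx,
\qquad\text{i.e.}\qquad d\hat\delta_a(x) = e^{-2\pi i a x}\,dx .
\]
% Left-hand side D F delta_a: integrate by parts, f being in S.
\[
D\hat\delta_a(f) = -\hat\delta_a(f') = -\int f'(x)\,e^{-2\pi i a x}\,dx
= \int f(x)\,(-2\pi i a)\,e^{-2\pi i a x}\,dx .
\]
% Right-hand side F M delta_a: M multiplies by -2 pi i x, evaluated at x = a.
\[
M\delta_a(f) = \delta_a(Mf) = -2\pi i a\,f(a)
\quad\Longrightarrow\quad
FM\delta_a(f) = M\delta_a(\hat f) = -2\pi i a \int f(x)\,e^{-2\pi i a x}\,dx .
\]
```

Both sides give the same linear form on S, as the general formula predicts.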


52 The formula has a meaning only because multiplication by $2\pi i y$ maps S
continuously into S. One may define $p(y)\,dT(y)$ for every function $p$ which is $C^\infty$
and such that $f \mapsto pf$ maps S into S. This assumes that $p$ and its successive
derivatives do not increase more rapidly at infinity than powers of $x$ (“tempered
functions”): the product of a function “of slow increase” by a function “of rapid
decrease” is again of rapid decrease.
If for example $f$ is a regulated function which is $O(|x|^N)$ at infinity for an
integer $N > 0$, then the formula


$T_f(\varphi) = \int \varphi(x)\,f(x)\,dx$


has a meaning for every $\varphi \in S$ and defines a tempered distribution; its Fourier
transform is, by definition, the Fourier transform of $f$; it goes without saying
that it is not a function in general.

In particular let us take $f(x) = x^p$ with $p \in N$. Then


$\hat T_f(\varphi) = T_f(\hat\varphi) = \int x^p\,\hat\varphi(x)\,dx$ for $\varphi \in S$;


multiplying by $(-2\pi i)^p$, one makes the function $M^p F\varphi = (-1)^p F D^p\varphi$
appear in the integral. Then


$(2\pi i)^p\,\hat T_f(\varphi) = \int F D^p\varphi(x)\,dx.$


But since $D^p\varphi$ is in S we may apply the Fourier inversion formula to it,
whence

$(2\pi i)^p\,\hat T_f(\varphi) = D^p\varphi(0) = \delta(D^p\varphi),$


where $\delta$ is the Dirac measure at the origin, clearly a tempered distribution.
In view of definition (12) of the derivative of a distribution, which gives
$\delta(D^p\varphi) = (-1)^p\,\delta^{(p)}(\varphi)$, the result can be written


$(-2\pi i)^p\,\hat T_f = \delta^{(p)},$


the derivative of order $p$ of the distribution $\delta$. For $p = 0$, we see that the
Fourier transform of the function 1 is the Dirac measure at the origin: this is
exactly what the formula $f(0) = \int \hat f(y)\,dy$, valid for $f \in S$, means.
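
The case $p = 1$ can be checked on a concrete test function. The choice $\varphi(y) = y\,e^{-\pi y^2}$ is ours, made because its transform is explicit under the convention $\hat\varphi(x) = \int \varphi(y)\,e^{-2\pi i xy}\,dy$; the check uses definition (12) in the form $\delta'(\varphi) = -\varphi'(0)$.

```latex
% Transform of the test function (the Gaussian e^{-\pi y^2} is its own transform).
\[
\varphi(y) = y\,e^{-\pi y^2}
\quad\Longrightarrow\quad
\hat\varphi(x) = -i\,x\,e^{-\pi x^2}.
\]
% Value of the transform of T_f, f(x) = x, on this test function.
\[
\hat T_f(\varphi) = \int x\,\hat\varphi(x)\,dx
= -i \int x^2 e^{-\pi x^2}\,dx = -\frac{i}{2\pi}.
\]
% Comparison with the derivative of the Dirac measure.
\[
(-2\pi i)\,\hat T_f(\varphi) = (-2\pi i)\Bigl(-\frac{i}{2\pi}\Bigr) = -1,
\qquad
\delta'(\varphi) = -\varphi'(0) = -1 .
\]
```

The two values agree, consistent with $(-2\pi i)\,\hat T_f = \delta'$ for $f(x) = x$.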

In conclusion we note that all this generalises to functions of several
variables. See for example the excellent Chap. 3 of Michael E. Taylor, Partial
Differential Equations. Basic Theory (Springer, 1996) or the ultracondensed
exposition of Lars Hörmander, The Analysis of Linear Partial Differential
Operators, Vol. 1 (Springer, 1983).
Excerpt from C. Stark Draper, “Critical Systems and Technologies for the Future”,
in International Cooperation in Space Operations and Exploration, vol. 27, Science
and Technology, 1971 (American Astronautical Society).


Postface 


Science, technology, arms 


The text below is only a part of the postface that was announced in the
preface to volume I; the full text, with references to sources, is substantially
longer than the French version, already 90 pages long. It will be available to
interested readers on the Internet, at the following address:
www.springer.online.com/de/3-540-20921-2

Readers who wish to understand why a mathematics textbook includes the
text below will find explanations in the preface to volume I.

I have tried to be as pedagogical as possible, but since this postface deals 
with many topics far removed from mathematics, it will, of course, require 
some work and good will from the reader to understand it. 

Many sources have been used, and all of them will be found in the internet 
version. A few have been mentioned in the printed text. 

Italics have been used for verbatim quotations in the main text. 


§ 1. How to fool young innocents 


The H-bomb was born in September 1941 at Columbia University in New 
York during a conversation between Enrico Fermi and Edward Teller. The 
explosion of an atomic bomb based on the fission of U-235 or Pu-239 nuclei 
could generate the tens or hundreds of million degrees necessary for the fusion 
of hydrogen nuclei, which in turn would generate amounts of energy hundreds 
of times greater than that of the atomic bomb itself. This was nothing more 
than a very rough idea, but Teller and others already knew (or believed) by 
1942 that, if a 30 kg mass of U-235 


is used to detonate a surrounding mass of 400 kg of liquid deu- 
terium, the destructiveness should be equivalent to that of more than 
10,000,000 tons of TNT [the standard military explosive]. This should 
devastate an area of more than 100 square miles. 


Yet the development of the A-bombs which destroyed Hiroshima (U-235) and 
Nagasaki (Pu) was top priority during the war, so that nothing much hap- 
pened for several years although a few people around Teller continued their 
theoretical studies of the problem; after a team of physicists reviewed the 
issues in the spring of 1946, even Teller went back to theoretical physics at 
Chicago. Calculations were very difficult to carry out, they neglected
physical effects that opposed the fusion reaction, the choice of which isotopes
of hydrogen to fuse was not easy, and the geometrical configurations they 
were drawing up could not work or, if they did, could not lead to the ul- 
timate weapon, namely something with theoretically unlimited power. Last 
but not least, experimental verification of the computations was impossible 
short of exploding an actual weapon. In addition, many influential physicists 
were against the development of a weapon which they viewed as being far 
too powerful and which would most likely be imitated by the Soviet Union 
sooner or later. 

The situation changed dramatically after the announcement in September 
1949 by President Harry Truman of a first secret (but detected) Soviet atomic 
test. The General Advisory Committee (GAC) of the Atomic Energy Com- 
mission (AEC, now part of the Department of Energy, DoE) was convened 
to deal with the new situation at the end of October. Basically for ethi- 
cal reasons, the GAC members (scientists J. Robert Oppenheimer, Arthur 
Compton, James Conant, Enrico Fermi, Lee A. DuBridge, Isidor I. Rabi, 
Cyril Stanley Smith, as well as the Bell Labs president, Oliver E. Buckley, 
and Hartley Rowe, an engineer) were unanimous in their opposition to the 
development and production of the H-bomb, though they were not against 
further theoretical studies; they recommended the production of more fission 
bombs — new types under development, up to 500 kilotons (KT), were deemed 
powerful enough to deter the Soviets -, including “tactical” ones (for use in 
Europe...), and they recommended providing by example some limitation on 
the totality of war; when briefed by Oppenheimer, the tough Secretary of 
State, Dean Acheson, a friend and admirer, replied: How can you persuade a 
paranoid adversary to disarm “by example”? Other scientists, like Teller and
Ernest Lawrence, who were not GAC members, were by contrast strongly in favor
and briefed the president of the Congressional Committee on Atomic Energy 
and top men in the Air Force, who began to call for it. Three of the AEC 
administrators (including the AEC President) were against it, and the other 
two for it, including Lewis Strauss, a most influential and conservative Wall 
Street tycoon who, like Teller, was as “paranoid” as Joe Stalin, and did not 
hesitate to go straight to Truman. The H-bomb supporters rejected the idea 
that America might come out second in the H-bomb race; and in an America 
again made fiercely anti-Communist by the Soviet domination of Eastern
Europe, by the 1948 attempt to blockade Berlin, and by the “loss” of China 
to the Communists in 1949, the overwhelming majority of people also wanted 
supremacy over, not parity with, the Soviet Union. Furthermore, the near to- 
tal American demobilization in 1945 and the rejection of Universal Military 
Training meant that reliance on atomic weapons was America’s only means 
of deterring, slowing down, or resisting the onrush of a Red Army which, 
after demobilizing, still retained about three million men and compulsory 
military training, even though the said onrush was considered by people in
the know to be quite improbable within five years. 

As President Truman said at the time, there was actually no decision to 
make on the H-bomb: he shared these arguments and, at the end of January 
1950, after three months of totally secret discussions involving about one 
hundred people, he publicly announced that the development of the H-bomb 
would continue; he also forbade people connected with the AEC, including 
GAC members, to discuss the subject in public. 

In early February, thanks to the partial decrypting of Soviet wartime 
telegrams, the unfolding of the Fuchs affair in Britain a few days earlier 
proved that the bright ex-German Communist physicist sent by the British 
to Los Alamos in 1943 had transmitted to the Soviets not only essential data 
on the A-bomb, but also most probably what was known on the future H- 
bomb up to April 1946: he had even taken a patent out on it, in common 
with von Neumann! The Soviets thus knew America was on the H-bomb trail, 
and America knew that the Soviets might also be working on it, as Teller 
had claimed — rightly, but without proof — long before. In March 1950, on 
the advice of the military, who did not need this new piece of information to 
make up their minds, Truman, this time secretly, made H-bomb production 
a top priority. 

The correct physical principles were not even known. Numerical calcula- 
tions carried out by mathematicians John von Neumann at Princeton and 
Stanislaw Ulam at Los Alamos, and performed partly on the new but
insufficiently powerful electronic machines, confirmed that Teller’s ideas could not
lead to the weapon he had been dreaming of since 1942; Teller’s optimistic 
calculations still relied on incorrect hypotheses or missing data. One (the- 
oretical) version of the weapon under consideration in 1950, which would 
develop a power of the order of 1,000 megatons, was some 30 feet long, and 
a stunning 162 feet wide; the fission trigger alone weighed 30,000 pounds. 
Technical follies, as Freeman Dyson would later say. 

Anyway, developing the weapon had to be done at the Los Alamos lab- 
oratory where the A-bomb had been developed and where a reduced team 
had remained or had been recruited since Hiroshima. Although the outbreak 
of the Korean war led many top physicists to join the project, many mem- 
bers were on Oppenheimer’s side as Teller knew full well, and he believed 
they were not enthusiastic enough to succeed. Supported again by Ernest 
Lawrence, the Air Force, and key Congressmen, Teller asked for the creation 
of a rival laboratory in 1950, but his request was denied by the Atomic En- 
ergy Commission. Teller was desperate at the end of 1950 and no longer sure 
a true H-bomb, with arbitrarily large power, could be made.

But in January 1951, Ulam devised a new geometric configuration: to 
separate completely the atomic trigger from the material to be fused. It was
seized by Teller who found an entirely new way to make the fusion work 
before the bomb blew up: the near-solid wave of neutrons flowing from the
atomic explosion was too slow; instead, his idea was to use the X-ray burst
from it to generate the necessary temperature and pressure. During a meeting 
at the Institute for Advanced Study in Princeton (which Oppenheimer now
headed) in June 1951, everyone enthusiastically agreed that this was the 
solution, and Teller got the laboratory he had asked for in September 1952. 
In November 1952, a test of the principles, using liquid deuterium and a
good-sized refrigeration installation, produced the 10 megatons (MT) predicted in
1942; it also vaporized a small island in the Pacific ocean. In April 1954, 
several tests of near-operational weapons using lithium deuteride, an easily 
stored white powder, produced between 10 and 15 MT - two or three times 
more than predicted, because one of the reaction phases had been overlooked. 
Operational weapons (10-15 MT) went aboard giant B-36 bombers from the 
end of 1954 to 1957; later ones never exceeded 5 MT and most were in the 
hundreds of KT range. All of these successes, and the great majority of later 
achievements too, were the work of Los Alamos people “lacking enthusiasm”. 
The first true Soviet H-bomb was tested in November 1955 and produced 
about 1.6 megatons. 

Set up at Livermore, not far from Berkeley, Teller’s laboratory is now 
called the Lawrence Livermore Laboratory (LLL) and has been managed, at 
least officially, by the University of California since 1952, as Los Alamos has 
been since 1943. All American nuclear weapons were invented at these two 
places; while this still remains Los Alamos’ basic activity, Livermore later con- 
centrated a large part of its work on much more innovative scientific-military 
projects, as will be seen below. Lawrence won a Nobel prize for his inven- 
tion at Berkeley in the 1930s of the first particle accelerators (cyclotrons). 
To a large extent, this was made possible by philanthropists attracted by 
the potential medical uses of radiation or artificial radio-elements available 
much more cheaply and abundantly than radium. During the war, Lawrence 
initiated and headed a massive electromagnetic isotope-separation process 
inspired by his cyclotrons; you can gauge Lawrence’s influence from the fact 
that the Treasury Department lent him over thirteen thousand tons of silver 
to wire his “calutrons”, despite an endless series of unexpected technical prob- 
lems which brought operations to a complete halt as soon as the war ended. 
They nevertheless performed the final enrichment, to 80% U-235, of much
of the partly enriched uranium obtained from another massive factory, where 
uranium hexafluoride — a very nasty gas — was blown through thousands of 
porous metallic “barriers”; the very primitive Hiroshima bomb used some 60 
kg of the final product. Together with Oppenheimer, Fermi, Arthur Comp- 
ton and Conant, as well as the Secretaries of War and State, Lawrence had 
participated in the June 1945 top-level discussions concerning the use of
the first available bombs. They had also recommended a well-financed re- 
search program in nuclear physics, military and civilian applications, as well 
as weapons production. 
It was to this most influential operator, whose Berkeley Rad Lab had
strong connections with Los Alamos, that the Atomic Energy Commission 
entrusted in 1952 the task of setting up a new development center for the H- 
bomb. Livermore needed a director, and Lawrence chose one of his assistants, 
Herbert York, then 30 years old. 

After Sputnik (1957), York took charge for a while of all American mili- 
tary research and development. Health problems forced him to cut down on 
his activities, and he “retired” to a California university, while still partici- 
pating in negotiations and meetings on arms control. From 1970 on, he wrote 
articles and books about the arms race, the absurdity and danger of which 
he could now clearly see. 

In 1976 he wrote a short book, The Advisors, recounting the development 
of the thermonuclear project and, in particular, the discussions which had 
taken place at the end of 1949 on the advisability of launching an H-bomb
development program. His book reproduces in full the recently declassified report
in which the AEC’s General Advisory Committee explains the practical and 
ethical reasons against it. 

With a rare frankness, York discloses the reasons which led him to par- 
ticipate in the project after the start of the Korean War (which led some op- 
ponents of the H-bomb, like Fermi and Bethe, to change their minds). There 
was first the growing seriousness of the cold war, much influenced by my very 
close student-teacher relationship with Lawrence, a fierce anti-Communist like 
Teller, Ulam, and von Neumann. There was also the scientific and technical 
challenge of the experiment itself: it’s not every day you get the opportunity 
to explode the equivalent of ten million tons of TNT for the first time in 
history (it was actually done by Los Alamos). There was also, and perhaps 
most importantly as every young scientist can understand, 


my discovery that Teller, Bethe, Fermi, von Neumann, Wheeler, 
Gamow, and others like them were at Los Alamos and involved in 
this project. They were among the greatest men of contemporary 
science, they were the legendary yet living heroes of young physi- 
cists like myself, and I was greatly attracted by the opportunity of 
working with them and coming to know them personally. 


Moreover, 


I was not cleared to see GAC documents or deliberations, and so I 
knew nothing about the arguments opposing the superbomb, except 
for what I learned second hand from Teller and Lawrence who, of 
course, regarded these arguments as wrong and foolish. (I saw the 
GAC report for the first time in 1974, a quarter of a century later!) 


In less than one page, you have something similar to the corruption of a 
minor taking place in the scientific milieu: you are told that the enemy is 
threatening your country, the scientific problem is fascinating, great men 
you admire set the example, other great men you don’t know personally are
opposed to the project but their arguments are top secret, those great men
who are luring you carefully refrain from honestly telling you what these 
arguments are, and, anyway, you'll be able to read the official documents in 
25 or 30 years if you are American, in 50 or 60 if you are French or British, 
and, at the earliest, after the fall of the regime if you are a Soviet citizen. If 
you are still alive, your delayed comments will have no impact whatsoever 
because the project in which you participated was completed decades before, 
and its justifications have perhaps changed radically in the meantime. 

This had already been seen in the A-bomb project: physicists were told (or 
claimed) in 1941 that the A-bomb was needed before the Nazis got one, it was 
discovered in May 1945, if not before, that they were years behind, but the 
bombs were still dropped: over a thoroughly defeated Japan. Quite a number 
of participants felt they had been fooled, even though they did not know, 
as we now do, that three weeks after Hiroshima, the Air Force sent General 
Groves, head of the Manhattan Project, a list of two dozen Soviet cities and 
asked him to provide the weapons (which was not done until 1948), while 
Stalin was giving absolute priority to his own atomic project. And nobody 
then — except perhaps Groves — imagined that tens of thousands of bombs 
would eventually be produced. 


Main references: Herbert York, The Advisors. Oppenheimer, Teller, 
and the Superbomb (Freeman, 1976), Stanislaw Ulam, Adventures of
a Mathematician (Scribner’s, 1976), Richard Rhodes, The Making 
of the Atomic Bomb (Simon & Schuster, 1988) and Dark Sun. The 
Making of the Hydrogen Bomb (Simon & Schuster, 1995), Gregg 
Herken, Brotherhood of the Bomb. The Tangled Lives and Loyal- 
ties of Robert Oppenheimer, Ernest Lawrence, and Edward Teller 
(Henry Holt, 2002), Peter Goodchild, Edward Teller. The Real Dr. 
Strangelove (Harvard UP, 2004), David C. Cassidy, Oppenheimer and 
the American Century (PI Press, 2005). 


York may not have been alone in this kind of situation; as Gordon Dean, 
AEC chairman 1950-1954, said at the Oppenheimer security hearing in 1954:


We were recruiting men for that laboratory [Livermore], I would say 
practically all of whom came immediately out of school. They were 
young Ph.D.’s and some not Ph.D.’s (...) Under Lawrence’s adminis- 
tration, with Teller as the idea man, with York as the man who would 
pick up the ideas and a whole raft of young imaginative fellows you 
had a laboratory working entirely — entirely — on thermonuclear work. 


Livermore’s then two divisions (thermonuclear and fission) were headed by 
Harold Brown, then 24, and John Foster, then 29; they both were later to 
head Livermore, then all military R&D, and even the Department of Defense 
(DoD). I don’t know whether, once past the age of innocence, some of these 
“young imaginative fellows” reflected on their past as York did.

I do know of other similar cases though. Theodore B. Taylor (1925-2004),
on hearing about Hiroshima, vowed never to have anything to do with atomic 
weapons, but he studied physics. In 1948, believing he was working for peace, 
he joined Los Alamos where he developed a fascination and a gift for improv- 
ing atomic weapons. He invented the best A-weapons of the time, including a 
500 KT fission weapon which, in May 1951, succeeded in fusing a few grams 
of deuterium; he also became an expert in predicting the effects of nuclear 
weapons. He left in 1956 for General Atomic (founded by one of Teller’s col- 
leagues) and the design of nuclear reactors, then headed the development of 
a spaceship propelled by multiple small atomic explosions and able to send 
people to Mars and beyond — the Nuclear Test Ban treaty prohibiting at- 
mospheric tests killed that project in 1963; in 1964 he was put in charge of 
the maintenance of nuclear weapons; in 1966 he resigned and worked for a
while with the Vienna-based International Atomic Energy Agency (IAEA), responsible for
controlling the civilian nuclear energy business. His initial taste for weapons turned
into its very opposite, notably after a visit to Moscow when, looking at the 
crowd in Red Square, he remembered he had helped the Air Force select the 
weapons best adapted to targets around the city, the Kremlin being most 
probably number one on the list. He spent the rest of his life advocating the 
abolition of nuclear weapons and nuclear energy which, he believed, would 
lead to an uncontrollable proliferation of weapons and even to their use by 
terrorists, a prospect he predicted in 1970 by emphasizing that the World 
Trade Center building could easily be felled by a small atomic explosion on 
its ground floor. 

Recruiting young imaginative fellows at Livermore and other places is still 
going on, of course. William Broad, a New York Times science journalist who 
spent a week there in 1984 with a very special “O group” of young physicists 
twenty to thirty years old, explains in Star Warriors the role of the Hertz 
Foundation, founded shortly after Sputnik by Hertz Rent-a-Car’s patriotic 
owner in order to maintain US technological preponderance (and to show 
his gratitude to a country which turned a poor immigrant into a very rich 
man). Every year the Foundation allocates about twenty-five fellowships,
valid for five years, to outstanding students; some of these are invited to 
spend a summer (or several years) at Livermore while preparing for their 
Ph.D. elsewhere. Those Broad met were asked to put their energies into 
problems at the cutting edge of technology with a not so obvious military 
interest: to build an optical computer using laser lines instead of electrical 
connections, to design from scratch and to miniaturize a supercomputer, 
to devise an X-ray laser, to elaborate a credible model of an atomic bomb 
using only published literature, etc. The group leader, Lowell Wood (who 
still sits on the Foundation Board together with several other Livermore or 
Los Alamos people), explained that: 


The best graduate students tend to do very marvelous work because 
it’s a win-or-die situation for them. There is no graceful second place.
If somebody else publishes the definitive results in the area, they go
back to zero and start over (...) They don’t realize how extremely 
challenging these problems are. So they are not dismayed or demor- 
alized at first. By the time they begin to sense how difficult the 
problems are, they’ve got their teeth into them and made sufficient 
progress so that they tend to keep going. Most of them win. They 
occasionally lose, which is very sad to see (p. 31). 


One of them, Peter Hagelstein, remembers his arrival in Teller’s kingdom in 
1975 when he was 20 years old: 


The lab itself made quite an impression, especially the guards and 
barbed wire. When I got to the personnel department it dawned on 
me [!] that they worked on weapons here, and that’s about the first 
I knew about it. I came pretty close to leaving. I didn’t want to have 
anything to do with it [and his girlfriend was militantly opposed 
to it, which eventually destroyed their relationship]. Anyway, I met 
nice people, so I stayed. The people were extremely interesting. And 
I really didn’t have anywhere else to go.


Hagelstein was asked to study the X-ray laser. He first spent four years, at the 
rate of 80-100 hours a week, learning the physics and doing computations with 
a very powerful program of his own. A senior Livermore physicist, George 
Chapline, had been trying for years to find a solution by using a nuclear 
explosion to get the energy needed to “pump” the laser (it is proportional to 
the cube of the frequency, which for X-rays is about 1000 times that of visible 
light). A first underground test in September 1978 was a failure because of 
a leak in a vacuum line. On Thanksgiving Day 1978, some senior physicists 
— including Wood, Chapline and an unwilling Hagelstein — were summoned 
to Teller’s home to discuss the problem; Hagelstein was ordered by Teller to 
review the calculations done for Chapline — nothing more, but nothing less 
— and he had no choice but to comply. By the next day, he had to tell Wood 
there was a flaw in Chapline’s theory, which put him in direct competition 
with Chapline. He found new ideas which he once dropped at a meeting in 
1979, too tired after a 20-hour working day to realize what he was doing. They 
were seized upon at once and, he told Broad, he “had [his] arm twisted to do
a detailed calculation”, under “political pressures like you wouldn’t believe”. To
his despair and with some prodding from Wood and Teller, his calculations 
and new ideas proved more and more promising, and in 1980 an underground 
test of his and Chapline’s new designs proved Hagelstein’s method was by 
far the better. He then had access to Livermore’s gigantic laser lines, and his 
laser, though still virtual, got a name: Excalibur. 
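
As a rough check of the cube law quoted above, and taking the factor of 1000 between X-ray and visible frequencies at face value (the symbols $P$ and $\nu$ are introduced here only for this illustration and do not come from Broad's account), the pumping power would scale as

\[
\frac{P_X}{P_{\mathrm{vis}}} \approx \left(\frac{\nu_X}{\nu_{\mathrm{vis}}}\right)^{3} \approx (10^{3})^{3} = 10^{9},
\]

that is, about a billion times what a visible-light laser would need, which suggests why a nuclear explosion was considered as the pump.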

Hagelstein tells us of political pressures; no wonder. On the political side, 
for several years before Reagan’s election, some very influential people — the 
Committee on the Present Danger — had been claiming that the Soviets were 
spending far more on defense than even the CIA said, and were re-arming
to full capacity. As a matter of fact, since 1975 they had been deploying
a few hundred new strategic missiles with multiple independent warheads 
(MIRV) (deployment of 840 American Minuteman-III MIRV missiles, and of 
640 Poseidon submarine-launched similar missiles, had started in 1970 and 
1971, respectively). They were also deploying very accurate middle-range SS- 
20s aimed at strategic targets in Western Europe and China. Their output of 
basic industrial goods (steel, coal, cement, etc.) was 50 to 100% higher than 
America’s (but the American economy was converting to an “information 
society” far more efficient than Stalin’s successors’ taste for steel). They were 
discovering huge fields of oil and natural gas from which they got plenty of 
foreign currency, allowing them to buy (mostly American) grain and, much 
worse, advanced foreign machinery in spite of the US embargo on high-tech 
goods. “Marxists” were seizing power in several African states; unrest in 
Poland was repressed by the Polish army to avoid a Soviet intervention; the 
Red Army had intervened (unwillingly at first) in Afghanistan to defend 
the new Communist regime against its enemies, which many interpreted as 
a first Soviet step towards the proverbial Persian Gulf “warm waters” the 
Tsars had never managed to seize. The American deployment in Europe of 
equally dangerous American Pershing ballistic missiles and Tomahawk cruise 
missiles in answer to the SS-20s was opposed by strong “peace movements” 
that were suspected of being infiltrated by Soviet agents since, of course, 
ordinary German citizens were deemed too stupid to worry for themselves 
about these displays of atomic fire power. In short, the world had entered 
what became known as the New Cold War.

Thirty-two members of the Committee on the Present Danger, including
Reagan, occupied high administration offices after he came to the White
House in January 1981. He immediately started to re-arm (the DoD budget,
mostly financed by foreign capital attracted by high interest rates, went up
from 181 billion dollars (BD) in 1978 to 270 BD in 1984 in constant dollars), and he continued to
taunt the Soviets in speeches that culminated in his famous “Evil Empire” 
statement in 1983. However, many people in Washington, including Reagan 
himself, believed that in spite of its apparent strength, the USSR was under 
tremendous economic pressure with a grossly inflated military sector and a 
grossly underdeveloped civilian sector. They thought that a new round in the 
arms race would bankrupt the Soviets, or force them to agree to significant 
cuts in strategic armaments, or both. 

There were already people in America trying to sell untested and wild 
anti-missile schemes, e.g. chemical lasers, 24 of which could supposedly de- 
stroy an entire fleet of Soviet missiles, or thousands of interceptors launched 
from hundreds of space stations. This led another bunch of conservative busi- 
nessmen who had nothing to do with nuclear weapons, but were close to Rea- 
gan, to found a High Frontier committee, including Teller who wanted to sell 
his X-ray laser right away; they wanted to reach the White House without
going through the Pentagon bureaucracy, where hard technical questions would
be asked, of course. In this way, Teller was able to recommend a Los Alamos
friend as Scientific Advisor, George Keyworth, who in turn appointed him 
to the new White House Science Council. A High Frontier report was sent 
to Reagan, and they got a fifteen-minute audience in January 1982; Teller 
apparently did not attend. They claimed that the Russians were well ahead 
in technology (as Teller had claimed to promote his H-bomb project), that 
they were close to deploying directed-energy weapons in space, thus altering 
the world balance of power. They recommended that America launch a major 
program to counter the Soviet threat in order to substitute “assured survival”
for “assured destruction”, which suited Reagan quite well. Hagelstein’s X-ray
laser was the key to success and would be available within four years, followed 
by even more powerful versions. All of this rested on the secret results of a 
single test performed in a totally artificial underground environment. 

Reagan, however, asked Keyworth to gather a team of experts from his 
Science Council in order to review the project before the end of 1982, if 
only to get an idea of the price tag. During this year, peace movements 
in America and Europe drew hundreds of thousands of people (and many 
scientists) demonstrating against the new arms race; many American Con- 
gressmen agreed. In June, a group of Livermore scientists who had the re- 
sponsibility of continuing work after the first test, reported that the project 
would require ten more tests, six years, and 150-200 million dollars a year to
establish reliably that this laser was scientifically possible; it would then have
to be transformed into an operational space weapon, which would require 
still more engineering, money, and years. This made Teller furious and all 
the more convinced that, as had been the case with the H-bomb, the project 
needed a lot of hype to take off. After complaining on TV that he had not 
yet met with President Reagan, he got an audience in September; some of 
those present interjected so many questions that Teller (and Keyworth) felt 
the meeting had been a disaster. In December, the House rejected funds for 
the production of a new and widely criticized generation of missiles, the MX, 
which could be randomly moved underground among many silos, most of 
them empty, in order to fool the Russian MIRV missiles. In January 1983, 
Teller got an audience with the Chief of Naval Operations; he was convinced 
by Teller’s views and converted the Joint Chiefs of Staff; to them, it was at 
least a way of convincing Moscow of the sheer financial power and technical 
superiority of the US, as well as a new way to inflate the Defense budget since
MX was becoming far too controversial. A meeting with Reagan in February
1983 ended in agreement; the military believed this would lead to an orderly
development project, but Reagan did not wait. In March, to everyone’s as- 
tonishment, he publicly announced his Strategic Defense Initiative (SDI, or 
Star Wars) project designed to protect the American people from Soviet mis- 
siles — a popular statement if ever there was one, which nevertheless did not 
placate the opposition.

In the meantime, in February 1983, a most famous nuclear physicist, Hans
Bethe, had gone to Livermore, reviewed Hagelstein’s project and found it ex- 
tremely clever physics, which did not mean clever weaponry. After Reagan 
launched his Star Wars project, Hagelstein’s laser became the most publi- 
cized — and controversial — part of it though it was still, at best, years away 
from any kind of operational status; a second test in March was actually 
inconclusive due to a recording failure. The media explained that, propelled 
into space by a single missile, individually oriented towards enemy missiles, 
and “pumped” by a nuclear explosion, fifty X-ray lasers were expected to 
destroy as many targets. Many physicists, foremost among them Hans Bethe 
and Richard Garwin, were opposed to this new exotic hardware display and 
said so publicly, because the chances of success were poor for many reasons,
among them the need for fantastically fast computers and communications (laser
weapons would be launched from submarines after the Soviet attack was detected,
they would have to spot missiles moving at a speed of four miles per second
and then orient the laser rods before firing, etc.); because nobody knew
whether the project would cost 150 or 3,000 billion dollars (BD) if successful;
and because it would only lead to one more spiral in the arms race and/or
could be easily defeated (as the Soviets at once remarked). Another official
panel reviewing the project came to rather pessimistic conclusions, relegat- 
ing Reagan’s dream to the year 2000 or so, and calling for a less ambitious 
goal, while at the same time recommending one billion and top priority for 
the laser, and 26 billion over seven years for the various other projects: SDI 
had already acquired an immense political power by this time. During a pro- 
paganda tour of Europe in 1985(?), SDI chief, General James Abrahamson 
used plenty of sexy slides to explain it all at the Paris Ecole polytechnique (I 
attended); this was a major contribution to the students’ scientific education: 
they (and I) did not know a thing about X-ray lasers, but you can trust them 
to have “understood” everything within a week. 

A few days after a successful test in December 1983, Teller sent an 
overly optimistic report to Keyworth, without notifying anyone, not even Roy 
Woodruff, a senior Livermore physicist who was deputy director for weapons 
design and thus oversaw the X-ray laser group; Woodruff was furious and 
wrote a corrective letter, which was blocked by Livermore’s director. In the 
Spring of 1984, other objections arose. According to Los Alamos scientists, 
beryllium mirrors that were sending a fraction of the beam to recording in- 
struments contained oxygen which, excited by the beam, possibly increased 
the recorded brightness. The dispersion of the laser beam in space, the num- 
ber of space stations and the power of the explosions needed were also publicly 
criticized by independent scientists. But in Washington others noticed that 
Soviet negotiators — who had been working for years on arms reduction — 
were very concerned about this militarization of space and therefore might 
be more accommodating; others again thought SDI would be a good
opportunity to wreck the 1972 ABM treaty, which seriously limited the deployment
of anti-ballistic missiles.

By 1984, Hagelstein had lost his initial dislike for weapons:


My view of weapons has changed. Until 1980 or so I didn’t want 
to have anything to do with nuclear anything. Back in those days I 
thought there was something fundamentally evil with weapons. Now 
I see it as an interesting physics problem. 


He did not have any illusions: 


I’m more or less convinced that one of these days we’ll have World 
War III or whatever. It’ll be pretty ugly. A lot of cities will get busted
up.


In October 1984, Hagelstein and a team of forty people realized at long last 
the first “laboratory” X-ray laser, using a 150-meter long laser line pumped 
by capacitors discharging ten billion watts; this success was still very far from 
the operational weapon Teller was promising Reagan. 

During this time, the Livermore group had devised the theoretical means 
to increase the laser power by several orders of magnitude, so that now 
“Super-Excalibur” lasers could be placed on a stationary orbit and still be 
able to kill missiles 20,000 miles away! At the end of 1984, Teller wrote 
through Wood to Paul Nitze, since 1950 the top expert in arms-control ne- 
gotiations: 


a single X-ray laser module the size of an executive desk which
applied this technology could potentially shoot down the entire Soviet
land-based missile force, if it were to be launched into the module’s 
field of view. 


Woodruff was again by-passed but learned of the letter; he again tried to 
send a corrective one, which again was blocked. However, in February 1985, 
he was allowed a two-hour meeting with Nitze, who said that “it’s always
good to get a bright skeptical mind on a problem”. The initial results of a
new and very elaborate test seemed so good in March that Teller’s constant 
lobbying did pay off: hundreds of millions were released. 

That same month, Mikhail Gorbachev came to power in the USSR, with 
a quasi-revolutionary program to transform the Soviet Union into a near- 
democracy and to terminate the arms race and the Cold War, which Reagan 
wanted too (but by other means). Although Gorbachev’s scientists told him that SDI
could be neutralized for 10% of the price to America, he decided to focus 
the US-Soviet arms-control talks on removing SDI in exchange for heavy 
cuts in missiles. Reagan met him in Geneva in November 1985 and, although 
the meeting was rather friendly, Gorbachev told him he should not count 
on bankrupting the Soviet Union or achieving military predominance, and 
that SDI would render impossible the expected 50% reduction in missiles. 
Reagan replied by extolling the virtues of defense, as usual. They continued
to correspond for months in the hope of getting some kind of agreement;
many of Reagan’s aides and top military (not to mention Europeans,
including France’s president Mitterrand) were appalled by Reagan’s apparent
willingness to dump American missiles provided he could keep SDI. 

In October, at the annual conference of Los Alamos and Livermore people 
on nuclear weapons, Los Alamos scientists reiterated in detail their skepticism 
over the test results or even the existence of the X-ray laser; this allowed most 
members of the X-ray laser group to understand for the first time that these 
objections were serious. And Los Alamos people accused Livermore managers 
of abdicating their prerogatives to Teller and Wood, who, of course, claimed 
Los Alamos were trying to sabotage their project for political reasons or out of 
rivalry. This was enough for Woodruff, who resigned from his position. Teller’s 
predictions, however, became somewhat more careful, and he emphasized 
that defense would be efficient even if it were only 20% effective because 
enough US missiles would survive to deter the Soviets from attacking in the first
place. 

In November, a new and very expensive test (30 MD) resulted mostly in 
failure. Some Livermore scientists, who were already exasperated by Wood’s 
authoritarian and sarcastic manner and by Teller’s constant meddling in their 
work, left the project; as one of them said in 1989, 


To lie to the public, because we know that the public doesn’t under- 
stand all this technical stuff, brings us down to the level of hawkers 
of snake oil, miracle cleaners and Veg-O-Matics. 


Although he dismissed Los Alamos objections, Hagelstein too was disgusted 
by Teller’s and Wood’s extravagant public claims and by the bad faith the 
main protagonists displayed; as he told Goodchild in 2000, “I could not believe
people behaved in that way”. However, it is easy to understand why they did.
These people with plenty of willpower had for decades been in charge of 
designing the awesome weapons on which US security was supposed to rest. 
They were under enormous political pressure, and billions of dollars had 
already been spent on or budgeted for their pet project. Their reputations 
and the laboratory’s were at stake. 

Hagelstein quit Livermore for the MIT Research Laboratory in Electron- 
ics, which had been conducting military research since 1945, and worked in 
quantum electronics and, later, “cold fusion”. This is a very controversial and 
to this day unproven method of generating energy at room temperature by 
means of fusion reactions among metallic compounds of hydrogen and deu- 
terium. His scientific reputation suffered greatly as a result. As to Woodruff, 
he was exiled to a tiny office (“Gorky West”) and his salary cut for
several years, a good illustration of the contradictory ethics governing open and
classified research; he joined Los Alamos in 1990. 

The conflict between the two laboratories surfaced in the newspapers, trig- 
gering another public but inconclusive discussion, since the relevant technical 
data were top secret. In 1986 several thousand scientists publicly pledged not
to participate in SDI in spite of the promise of exciting problems to solve
and plenty of money for their laboratories. Politically, Teller won; he was 
supported by the military, by influential Congressmen, and by Reagan who 
understood nothing but trusted the “father of the H-bomb”; Teller hesitated 
neither to rely on Reagan’s faith, nor to use his own scientific self-confidence, 
reputation and authority to ruthlessly counter opponents. 

As for the Star Wars project, it survived until Bill Clinton’s election in
1992. In January 1986, Gorbachev proposed to get rid of all Euromissiles on
both sides, and to eliminate all nuclear weapons by 2000, provided America 
gave up developing, testing and deploying space weapons. Reagan proposed 
instead to reduce strategic warheads to 6,000 on each side (this was achieved 
four years later under George Bush) and to redress existing conventional im- 
balances. In July, Reagan proposed scrapping all ballistic missiles within ten 
years while continuing research on SDI which, when operational, would be 
made available to all (!). They had a second meeting in Reykjavik in Oc- 
tober 1986 during which extraordinary proposals were made on both sides 
with a view to eliminating nuclear weapons entirely and reducing conven- 
tional forces. Once more, SDI killed the agreement at the last moment. Gor- 
bachev’s advisors (who were as bewildered as their American counterparts 
by these proposals) told him that Congress would kill SDI for him anyway. 
He did not follow their advice, but they were right: Congress cut the SDI 
budget by one third and prohibited tests in space in December 1987. In the 
meantime, a Livermore friend of Teller’s had found a new miracle weapon, 
Brilliant Pebbles: space stations firing thousands of sophisticated
projectiles, full of electronics, which would collide with Soviet warheads. A third
Reagan-Gorbachev meeting in Washington a few weeks later led to the end 
of Euromissiles.

The Cold War died in 1990 and with it the Soviet Union and SDI; a few
years later, the French Riviera was invaded by a new brand of Bolsheviks: 
oligarchs. The life expectancy of ordinary Russians began to decline. The 
European Union’s eastern boundaries (and with them those of NATO, a clever way
of assuaging nationalist feelings in Russia) are now the pre-1939 boundaries 
of the former USSR. Last but not least, it has been “proved” that socialism 
is a dead end (especially if confronted with savage aggression followed by a 
ruinous fifty-year arms race led by a far more powerful opponent). 

When asked why SDI did not work, Teller recently replied with a shrug: 
because the technology was not ready. The X-ray laser had cost 2.2 BD, and 
Star Wars a total of 30 BD. America is now spending a mere ten billion a 
year to develop anti-missile weapons against lesser threats than the Soviet 
arsenal, while Livermore (as well as the French Atomic Energy Commission) 
is trying to achieve, among other projects, controlled nuclear fusion of hydro- 
gen isotopes by means of convergent laser beams in the hope, going back to 
1950, of transforming nuclear fusion into an inexhaustible source of energy, 
as was done much earlier with nuclear fission. This also allows weapons
designers to gain a deeper knowledge of fusion processes so as to improve their
computer programs. 


Main references: William J. Broad, Star Warriors. The Weaponry 
of Space: Reagan’s Young Scientists (Simon and Schuster, 1985 or 
Faber and Faber, 1986), Goodchild, Edward Teller, Martin Walker,
The Cold War (Vintage, 1994), John Prados, The Soviet Estimate. 
US Intelligence Analysis and Soviet Strategic Forces (Princeton UP, 
1986), Stephen I. Schwartz, ed., Atomic Audit. The Costs and Con- 
sequences of US Nuclear Weapons Since 1940 (Brookings, 1998). 


Before having a look at Ken Alibek’s Soviet career in biological weapons 
(BWs) from 1975 to the fall of the Soviet Union, let me sketch their previ- 
ous development. After Pasteur, Koch, Metchnikoff and others had founded 
microbiology, it became possible to produce large amounts of vaccines. It 
also became obvious that, if required, similar techniques could be used to 
cultivate pathogens. That it was not pure theory was shown when the 1925 
Geneva Convention prohibited it. The USA did not sign it, but the USSR 
and Japan did; it seems that the USSR began to develop a typhus weapon in
1928, while Japan installed a very successful secret laboratory and produc- 
tion unit in Manchuria in the 1930s. Britain started to study vaccines after 
1936 and, after the Nazis had advertised their brand of ethics at Warsaw and
Rotterdam, thought it advisable to develop BWs as a hedge against similar 
German ones (they were not studied seriously until 1943 and came to almost 
nothing). British scientists worked mainly with anthrax, a bacterium which 
is easy to cultivate and store by transforming into spores that stay virulent 
for decades. Conclusive experiments on sheep were done at Gruinard Island, 
off Scotland; it was still contaminated and off limits fifty years later. They 
made anthrax cakes in sufficient quantities to be able to kill a lot of German 
cattle (and some people as well). 

In America, studies on BWs began in 1940, and a National Academy of 
Sciences (NAS) committee was set up a month before Pearl Harbor. Although 
its February 1942 report was inconclusive in the absence of practical tests, it 
recommended studying all possibilities (for defense, of course) including
anthrax, botulin toxin, and cholera. The program involved the Chemical Warfare
Service, the Department of Agriculture for anti-crop weapons, and 28
universities. Although behind Britain until Pearl Harbor, American industry 
quickly developed a far bigger military potential than Britain, which, in this 
domain as in others (atomic bomb, radar, jet engines, etc.), contributed ex- 
perts and knowledge, including penicillin which was industrialized in America 
during the war. 

A research center was set up at Camp Detrick and, in Vigo, Indiana, 
a factory equipped with twelve 5,000-gallon fermenters could in principle 
produce 500,000 four-pound anthrax bombs a month, or 250,000 filled with 
botulin toxin (lethal dose: one milligram). The Americans also investigated 
brucellosis, a more humane weapon which kills few people, but is highly
contagious and makes its victims ill for weeks or months, thus overwhelming
the enemy’s health system. Weapons for use against Japanese rice crops were 
also developed. But Roosevelt was not very interested in these matters about 
which he was very ill informed, and he never made his position clear one way 
or the other. 

In any case, peace came before this program became operational, and 
Vigo was leased to a private manufacturer of penicillin. In 1945 BWs were 
considered potentially at least as efficient as, and much cheaper than, the 
atomic bomb; and since they don’t destroy real estate, you don’t have to 
compensate the enemy and allies after the victory. But atomic weapons were 
viewed as a sufficient deterrent, performing realistic tests of BWs was im- 
possible, and the new German neurotoxic gases (tabun, sarin, soman) killed 
much faster — in a few minutes — than BWs. So, at first, work on BWs was 
limited to laboratory studies. During the Korean War, the Americans were 
accused of having experimented with BWs; it is now generally believed they 
had not, but the war accelerated the arms race in all domains, including 
BWs. In both the US and the USSR, all kinds of pathogens (anthrax, plague,
tularemia, yellow fever), and later other viruses, were studied and mass produced.
From 1947, the Soviets worked on smallpox which, though by now eradicated, was
still infecting some 15 million people a year and killing about two million of
them in the 1960s. They built
huge research centers and production units, some in cities, such as Sverdlovsk. 
The CIA had reason to suspect the worst as U-2 and satellite observations 
showed installations looking very much like the American ones, for instance 
a test range on an island in the Aral Sea. 

During the 1950s, scientists in both countries discovered that instead of 
storing or spreading bacteria as liquid cultures, it was far better to dry and 
deep-freeze them (lyophilization); this kept them dormant for long periods, 
even at room temperature. The result was then milled into an ultra-fine 
powder which, after being carried by the wind over possibly tens of miles, 
became virulent again in people’s lungs. This process worked particularly well 
with anthrax, the pulmonary form of which is normally rare and difficult to 
diagnose and kills 90% of its victims unless they are administered massive 
doses of penicillin very early. 

The “top secret” American programs were actually known to plenty of 
people and, like the use of chemicals to destroy jungles in Vietnam, met with 
opposition from journalists, students, and biologists like Harvard’s Matthew 
Meselson and Joshua Lederberg; the latter, who won a 1958 Nobel prize for 
his discovery of how bacteria can exchange genes in a natural setting, was in 
a good position to know that fast progressing molecular biology can be bent
to genocide, as he wrote in the Washington Post in 1968. During the Vietnam
war, opponents, particularly students, organized public demonstrations
against Fort Detrick, as well as protests against military-university contracts
and the National Academy of Sciences’ involvement in recruiting young scientists
for Fort Detrick. For their part, the military were not yet convinced of
the usefulness of these weapons; proliferation was too easy and too cheap, and
terrorist attacks were already being mentioned. Eventually, President Nixon 
unilaterally announced in November 1969 that America would limit herself 
to purely defensive work, and he ordered the destruction of stocks and the 
demilitarization of Fort Detrick, Pine Bluff and other centers; I remember a 
Science headline: “Is Fort Detrick really de-tricked?” In 1972, an international
treaty between the US, the USSR and Britain, later approved by many other 
countries, prohibited the production and possession of biological weapons, 
but not defensive laboratory work; it did not provide for inspections either. 

Before 1972, although “weaponizing” pathogens required solving difficult
technical problems, only natural bacteria and viruses were used. In 1972-1973,
American biologists succeeded in systematically moving a gene from an
organism to a bacterium in such a way that the modified bacterium would
replicate itself as usual; their first experiment yielded a variant of the normally
harmless Escherichia coli that was resistant to penicillin. Thus genetic
engineering was born and, with it, the possibility of discovering, by chance or 
on purpose, new pathogens from which no protection was known. But in the 
USSR, molecular biology and Mendelian genetics had been almost destroyed 
by Lysenko in the 1930s, and Soviet scientists were increasingly frustrated at 
the thought of being left behind. According to Alibek, the situation changed 
when a vice president of the Akademia Nauk, Yuri Ovchinnikov, explained 
to the Ministry of Defense and to President Brezhnev that bioengineering 
could lead to new weapons. 

This led to the founding in 1973 of an officially civilian pharmaceutical 
organization, Biopreparat, under the Ministry of Health. Biopreparat’s open 
mission was to develop and produce standard vaccines and antibiotics, but it 
enclosed a supersecret “Enzyme” project whose purpose was to develop and 
produce for intercontinental war genetically altered pathogens, resistant to 
antibiotics and vaccines, an outright violation of the 1972 treaty. It also led,
as Ovchinnikov hoped, to a reversal of the taboo against genetics and mole- 
cular biology, and to new laboratories depending on the Moscow Academy 
since “purely scientific” work was paramount for “defense” against biologi- 
cal weapons. The timing was perfect: gene splicing had just been discovered, 
and its practical importance would soon be proved in the USA by using engi- 
neered bacteria to produce large amounts of insulin, hormones, etc. Enzyme, 
which was led by military scientists and administrators with KGB men every- 
where, came to employ 32,000 workers, including many of the best biologists, 
epidemiologists, and biochemists, in addition to thousands of people working 
in Army labs. 

Let us now go back to Alibek. Hoping to become a military physician 
who could save soldiers on the battlefields, he studied medicine at a military 
school and became interested in research. In 1973, he was ordered by one 
of his teachers to investigate a very unusual outbreak of tularemia which 
occurred around Stalingrad in 1942 among German troops before spreading 
to the Soviet army. After reading old documents, Alibek reported that this
incident “looked as though it had been caused intentionally”. He was at once
cut short by his teacher, who told him he was only supposed to describe “how
we handled the outbreak”, not what had caused it, and strongly advised him “never
to mention to anyone else what you just told me. Believe me, you’ll be doing
yourself a favor”. The lessons he drew from this episode are worth quoting:


The moral argument for using any available weapon against an enemy 
threatening us with certain annihilation seemed to me irrefutable. I 
came away from this assignment fascinated by the notion that disease 
could be used as an instrument of war. I began to read everything I 
could find about epidemiology and the biological sciences. 


In 1975, a mysterious and well tailored visitor came to interview him and 
other students; he said he was working for a no less mysterious organization 
attached to the Council of Ministers “which has something to do with biological
defense”, a prospect which excited Alibek. He was handed a questionnaire
and told: “Don’t tell your friends or teachers about this conversation. Not even
your parents.” A few weeks later, he learned he was assigned to the Council
of Ministers of the Soviet Union together with four other students. He was 
overjoyed by the prospect of working in Moscow, but he was actually sent 
to a “post office box” hundreds of miles from Moscow. Like Hagelstein, he 
was impressed by the concrete wall and barbed wire surrounding the place 
and by the armed guards at the entrance. The huge Omutninsk Base where 
he arrived already employed some 10,000 people; it was part of the Enzyme 
project. 

On arrival at Omutninsk, Alibek and his friends were not given any infor- 
mation about their research program. A KGB instructor however informed 
them that although an international treaty banning biological weapons had 
been signed in 1972, it was obviously one more American hoax, which they
were quite prepared to believe; the Soviet Union therefore had to be ready 
to reply. 

When Alibek began to discover Omutninsk’s true mission (mass production
of pathogens and not merely laboratory research), he tried to get
another job but was told he could not be spared. He thus remained and,
after this classic early conscience crisis, adapted to the situation with enough 
success and enthusiasm to become Biopreparat’s deputy director fifteen years 
later. The science and technique were fascinating and the career very reward- 
ing provided you were bright, which he was, and made no big mistakes (such 
as inoculating yourself or being too talkative...). 

The new recruits were trained in the culture of bacteria, the techniques being
the same whether they are intended for industrial applications, weaponization,
or vaccination. This is a difficult art which is first learned on harmless
bacteria; one then has to learn how to infect lab animals with mildly patho- 
genic agents and conduct autopsies, until one may perhaps be allowed to 
work in “hot zones” with infected animals, where wearing the equivalent
of a space suit is compulsory: half a dozen Ebola viruses will kill you in a
month by destroying your blood vessels. A very competent colleague of Al- 
ibek once made a false move while inoculating an animal; after his death, 
they noticed the viruses in his body were particularly virulent, and there- 
fore they weaponized this “Ustinov strain”. One also has to learn industrial 
production processes. 

Smallpox was modified to render all known vaccines useless. Diphtheria 
was grafted on plague. Sergei Popov, a bright colleague, improved Legionella
with fragments of myelin DNA to trigger metabolic reactions that devastate 
the brain and nervous system. The invention of a form of tularemia resis- 
tant to three of the main antibiotics, as well as studies on Ebola-like viruses 
took years of work. All in all, little produced by the genetic engineering pro- 
grams was turned into weapons before the Soviet Union collapsed, according 
to Popov who has been living in the USA since 1992; Alibek also remains 
somewhat skeptical, though more pessimistic. 

Incidents happened during this period. In April 1979, about sixty people 
died within a few weeks in the city of Sverdlovsk, an extremely unusual 
event. There was a Biopreparat branch located in the city, working round 
the clock on anthrax. A Russian magazine in West Germany broke the news 
of the outbreak in November, from which US intelligence agents again drew 
conclusions, despite claims that the deaths were due to contaminated meat. 
It is now known that a clogged air filter had been removed but not replaced 
for several hours... 

In October 1989, Vladimir Pasechnik, a very bright scientist at the head 
of a civilian institute in Leningrad, went to France at the invitation of a 
pharmaceutical equipment manufacturer, and never came back. Since his 
institute had worked very efficiently for Biopreparat, he knew quite a lot. He 
was brought to Britain and debriefed. 

Pasechnik’s defection had serious consequences. In a memo to Gorbachev, 
KGB chairman Vladimir Kryuchkov recommended “the liquidation of our biological
weapons production lines”, a stunning move which Alibek approved
since, after all, “so long as we had the strains in our vaults, we were only three
to four months away from full capacity”. Although many powerful people
disapproved of Kryuchkov’s initiative, Gorbachev issued a few weeks later a
secret decree, prepared by Alibek and another fellow, ordering Biopreparat
“to cease to function as an offensive warfare agency”; but in transmitting
Alibek’s text to the Kremlin, his chief added a paragraph instructing the
organization “to keep all of its facilities prepared for further manufacture and
development”, which resurrected Biopreparat as a war organization, as Alibek
says. He was furious but this, at any rate, allowed him to order an end to 
military development at some of the most important installations. 

A second consequence was an agreement between the USA, UK and the 
USSR to organize inspections of suspected BW facilities. The first inspection 
of a few Biopreparat installations took place in January 1991; Alibek and 
the Russian side were very successful in showing as little as possible, but the
visitors, who were aware of Pasechnik’s disclosures, were not fooled. 

In December 1991, during the week the Soviet Union collapsed, a visit 
to four American installations chosen by the Russians took place; they were 
known to anybody who had read Science magazine around 1970 (as I did). 
The Russian team included Alibek who could verify that these installations 
were in a dilapidated condition that precluded military work, or had been 
converted to medical research (work on the rejection of organ transplants
fascinated the Russians), or, in one case, had never done any military research.
The Soviet delegation nevertheless reported to the contrary, and this
convinced Alibek that official justifications for his work had been a KGB 
hoax rather than an American one. 

He resigned from the Army, then from Biopreparat, got a job at once 
in a bank (“I had no aptitude for finance, but I was soon making deals like
everyone else”), and went on business trips abroad. His telephone was tapped,
police watched him around Moscow, and some associates warned him that he 
had better not leave Russia for good and that in any case his family would 
never get permission to leave. In the meantime, a Yeltsin decree banned all 
offensive research and cut defense funding. 

Alibek then went back to his native Kazakhstan, a newly independent 
country where a huge Biopreparat production center had been built years 
before. Local officials asked him to head a “medical-biological directorate” 
obviously intended for weapons research. He flatly rejected the offer, thus 
burning his bridges to both Russia and Kazakhstan, he tells us. Since he could 
still travel abroad for business, he was able to get in touch with Americans 
who were highly interested in his past and, with the help of a few Russians, 
managed to get himself and his family out in circumstances he obviously does 
not disclose. 

While being debriefed in Washington, Alibek struck up a friendship with his
American counterpart, Bill Patrick, who had been at Fort Detrick for forty 
years and was then its chief scientist. Comparing the nature and timing of 
American and Soviet programs since the war, they came to the conclusion 
that at least one disciple of Klaus Fuchs must have been near the top of 
the US organization. After being kept under wraps for several years, Alibek 
went public and told his story in Biohazard (Delta Books, 1999). He is now 
the president of a new company, Advanced Biosystems, working on defense 
against biological weapons and employing, among other people, ex-Soviet 
scientists, e.g. Popov. And a good deal of cooperation with the US is helping 
former weaponeers in Russia to convert to peaceful research and to survive the 
rise in Lenin’s country of the Robber Barons’ variant of American capitalism. 

Pyromaniacs, let us hope, are thus being transformed into firemen; a clas- 
sic process. Nevertheless, the work is going on everywhere now, not only for 
“defensive” purposes in military laboratories, but also and mainly in perfectly 
harmless civilian labs by scientists who publish their findings in standard 
journals. Although many biologists have tried for decades to devise “ethical
rules”, knowledge is spreading, the techniques are becoming increasingly
easy to learn, and weapons of mass destruction are now threatening their
initiators in this domain, as atomic and chemical weapons did long ago. 


References: Ken Alibek with Stephen Handelman, Biohazard (Delta 
Books edition, 2000), Judith Miller, Stephen Engelberg, and William 
Broad, Germs. The Ultimate Weapon (Simon & Schuster, 2001), 
Robert Harris and Jeremy Paxman, A Higher Form of Killing. The 
Secret Story of Gas and Germ Warfare (Granada Publishing edition, 
1983). The potential of some of these weapons can be judged from 
Richard Preston’s (real life) thriller, The Hot Zone (Random House, 
1994, or Anchor, 1995). 


The adventures of these weapons designers are, of course, extreme cases; 
I relate them here because extreme cases are extremely clear. In normal 
practice, a scientist and particularly a mathematician can only bring a small 
contribution to a complex weapons system. This does not raise such enormous 
and visible ethical problems as the development of H-bombs or biological 
weapons. But it only makes it easier for confusionists, mystifiers or corruptors 
to neutralize your objections. 

More simply, one may be asked to solve a limited problem without be- 
ing told of its military end. Although headed by the Department of Defense 
(DoD) Advanced Research Projects Agency (ARPA or DARPA), the Internet 
project — more accurately Arpanet, its predecessor — was to a large extent 
developed in a few university centers by many graduate students who were 
fascinated by it; many innovations are due to them. Contract holders (“Principal
Investigators”) had, of course, to provide ARPA with (sometimes vague
or long term) military justifications, and some of the top people went from
ARPA to universities or back. But, as Janet Abbate tells us in Inventing the
Internet,


although Principal Investigators at universities acted as buffers be- 
tween their graduate students and the Department of Defense, thus 
allowing students to focus on the research without necessarily hav- 
ing to confront its military implications, this only disguised and did 
not negate the fact that military imperatives drove the research (...) 
During the period during which the Arpanet was built, computer 
scientists perceived ARPA as able to provide research funding with 
few strings attached, and this perception made them more willing 
to participate in ARPA projects. The ARPA managers’ skill at con- 
structing an acceptable image of the ARPANET and similar projects 
for Congress ensured a continuation of liberal funding for the project 
and minimized outside scrutiny. 


Military secrecy can only lead to similar situations. 
That said, not everyone was fooled or seduced, as the case of Pierre Cartier
shows. While a student at the Ecole normale supérieure in Paris around 
1950, he was attracted by both mathematics and physics without at first 
being able to choose. He once told Yves Rocard — a physicist with strong 
industrial and military connections, who headed the physics lab at the school 
— that he wanted to work for a doctorate. Rocard then handed him a thick 
bundle of photographs; Cartier understood at once that these were a series 
of very close steps in an atomic explosion. Rocard proposed that he find 
a way of computing its power from these pictures, for instance from the 
propagation of the shock wave, or something similar. Cartier did not like the 
idea, still less Rocard’s conditions: Rocard would help Cartier to get a good 
university position, but his thesis would remain secret, and he would have 
to sever his relations with his Communist friends, as well as with Rocard’s 
son Michel, who was embarking on a political career (he became a Socialist 
Prime Minister thirty years later) and, at the time, had rather leftist opinions 
which were out of phase with Rocard’s. 

This decided Cartier to choose mathematics. He soon became a Bour- 
baki member and one of the best French mathematicians of his generation, 
still with a taste for mathematical physics, though not Rocard’s brand. Of 
course, one can explain Cartier’s reaction by the fact that, beside having 
strong religious beliefs, he was exposed to a much wider spectrum of political 
and philosophical opinions at the Ecole normale — where there are as many 
students in humanities as in science, all living together — than at Livermore 
or at a Soviet military school of medicine. Still, not everyone reacted the way 
he did. Thousands of scientists (and many more engineers) worked, and are 
still working, on military projects with no qualms. 


§ 2. The evolution of R&D funding in America 


All scientists of my generation know, if only vaguely and without proclaiming
it too loudly, that WW II and the Cold War “did wonderful things” (I.I.
Rabi) for science and technology; Rabi spent his whole career at Columbia
University from 1928 to his death, was already a physics star by WW II, 
later a Nobel Prize winner, and a top government advisor for decades. I have 
sometimes been told by colleagues that a statement as “obvious” as Rabi’s 
requires no proof, cafeteria gossip presumably being enough. If this is the 
case, then professional historians of science and technology might as well 
retire. 

In this section, I’ll first summarize the evolution of R&D in the USA 
since the war, since this country has clearly been the leader and even the 
model for half a century; Britain and France, as well as the Soviet Union, 
have always tried to follow America and to adopt its priorities, more or less, 
with differing results. R&D, for “Research and Development”, means basic 
research (without any practical purpose in sight), applied research (with a 
more or less well defined practical purpose), and development, during which
scientific results are used to design prototypes ready for production. These 
distinctions are not always very definite, and development usually requires 
solving many engineering problems, sometimes unexpected scientific ones, as 
well as extensive (and expensive) tests. Roughly speaking, basic and applied 
research cost 10 to 15% of R&D budgets each and development requires some 
70% of it, but the proportions very much depend on the field. 

The roughest measure of a country’s R&D activities consists in compar- 
ing their total cost to the Gross National Product (GNP). In the USA, the 
proportion increased from 0.2% in 1930 and 0.3% in 1940 to 0.7% in 1945, 
1.0% in 1950, 1.6% in 1954, 2.4% in 1958, and to a peak of 3.0% in 1964; at 
that time, US funds represented about 60% of all that was spent on R&D 
in OECD countries (North America, Western Europe, Japan, etc.). As many
articles, reports on “technological gaps”, and books attested at the time, all 
other countries, and especially de Gaulle’s France, looked at this 3% figure 
with an awe bordering on the mystical; someone joked that the optimal rate 
might be 3.14159...%. Since, moreover, the US GNP had climbed, in constant 
currency, from 100 BD in 1940 to about 300 BD in 1964, you can see that
in this decisive quarter of a century, R&D expenses multiplied by ten in pro- 
portion with the GNP and by thirty in constant dollars! Such a miraculous 
growth rate could not, of course, be sustained: the R&D/GNP ratio began 
to fall as soon as it reached 3%, went down to 2.2% in 1978 and wavered be- 
tween 2.6 and 2.8% between 1983 and 2000. The current and very optimistic 
goal of the European Community is to reach 3% by 2010. 
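
The arithmetic behind the "ten" and "thirty" can be checked in a few lines. This is only a sketch using the round figures quoted above (0.3% and 3.0% of GNP, and a GNP of roughly 100 and 300 billion constant dollars); the variable names are illustrative and not taken from any source.

```python
# Illustrative arithmetic only; all inputs are the round figures quoted in the text.
ratio_1940 = 0.003    # R&D as a fraction of GNP in 1940 (0.3%)
ratio_1964 = 0.030    # R&D as a fraction of GNP in 1964 (3.0%)
gnp_1940_bd = 100.0   # GNP in billions of constant dollars, 1940
gnp_1964_bd = 300.0   # GNP in billions of constant dollars, 1964

# R&D spending in billions of constant dollars for each year.
rd_1940_bd = ratio_1940 * gnp_1940_bd
rd_1964_bd = ratio_1964 * gnp_1964_bd

# Growth of the R&D/GNP ratio, and growth of the spending itself.
ratio_growth = ratio_1964 / ratio_1940
spending_growth = rd_1964_bd / rd_1940_bd

print(f"R&D spending: {rd_1940_bd:.1f} BD in 1940, {rd_1964_bd:.1f} BD in 1964")
print(f"R&D/GNP ratio multiplied by about {ratio_growth:.0f}")
print(f"constant-dollar spending multiplied by about {spending_growth:.0f}")
```

The factor of thirty is simply the tenfold growth of the ratio multiplied by the threefold growth of the GNP.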

In America as everywhere else, the two main sources of R&D funds are 
the Federal Government and private industry. Universities and not-for-profit 
private organizations also contribute, but on a much smaller scale, though 
their contributions to basic research may be important in some sectors. For 
instance, after having made a huge fortune in Hollywood, with the TWA airline,
by buying hotels and casinos in Las Vegas and by selling planes to the
Pentagon, Howard Hughes, like John D. Rockefeller long before him, set up 
a foundation whose trustees manage his little hoard, by now worth some 11 
billion; the dividends support selected projects in medical research, by far 
the most popular field in America for a long time. 

The relative importance of these two main sources of R&D funding has 
changed considerably since 1940. This is basically due to the nearly linear or 
weakly exponential growth of private industrial funds, while the fluctuations 
in federal funding were much larger, as will be shown. 

In 1940, the figures (in current MD) for national total and for federal and 
industrial contributions were 345, 67 and 234, respectively. In 1945, they were 
1520, 1070 and 430, respectively. In 1950, they were 2870, 1610 and 1180. 
Although data for these years are not entirely reliable, the trend is clear. 
For each year between 1953 and 2000, data in constant (1996) MD are
available in Science and Engineering Indicators 2002, an NSF publication
easily available at nsf.gov/srs. It provides some significant figures: 


         Total   Federal  Industry  Universities  Nonprofit


1953     26805     14455     11670           190        286
1958     50439     32228     17130           256        492
1966     90236     57910     29971           673       1028
1975     89112     46289     39531          1078       1335
1982    122034     56200     61422          1821       1653
1987    162798     75468     80660          2916       2383
1994    176246     63316    103326          4100       3816
2000    247519     65127    169339          5583       5415


From less than 20% in 1940, federal contributions to the total R&D reached 
almost 62% in 1966, stayed over 50% until 1975, remained at 46% during 
the Reagan years (1980-1988) in spite of a sharp increase in federal (actually, 
military) funds, then decreased to 26% in 2000. It is only since 1980 that 
industry has been spending more than Washington. To a large extent, the 
proverbial “innovative capacity” of US private enterprise has been propelled 
by federal dollars for almost 40 years, and mainly by defense as shown below. 
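
The percentages for the later years can be recomputed from the table above. The sketch below copies the Total, Federal and Industry columns (constant 1996 MD) and prints the federal share for each listed year; the names `rd_funding` and `federal_share` are purely illustrative. With these rows the 1975, 1982 and 2000 shares come out at about 52%, 46% and 26%, as stated, while the 1966 row gives about 64%, a little above the figure quoted in the text.

```python
# Rows copied from the table above, in constant (1996) million dollars (MD).
# year: (national total, federal contribution, industry contribution)
rd_funding = {
    1953: (26805, 14455, 11670),
    1958: (50439, 32228, 17130),
    1966: (90236, 57910, 29971),
    1975: (89112, 46289, 39531),
    1982: (122034, 56200, 61422),
    1987: (162798, 75468, 80660),
    1994: (176246, 63316, 103326),
    2000: (247519, 65127, 169339),
}


def federal_share(year):
    """Federal contribution as a percentage of the national R&D total."""
    total, federal, _industry = rd_funding[year]
    return 100.0 * federal / total


for year in sorted(rd_funding):
    print(f"{year}: federal share {federal_share(year):.1f}%")

# First listed year in which industry spends more than the federal government.
first_industry_lead = min(
    year for year, (_total, federal, industry) in rd_funding.items()
    if industry > federal
)
print("industry first ahead (among listed years):", first_industry_lead)
```

Only the years listed in the table are tested, so the crossover year found this way is an upper bound on the actual one.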

All federal agencies contribute to the funding of R&D. The Department 
of Defense (DoD) has been the most significant since 1941, followed by the 
Department of Energy (DoE, founded in 1977, dealing with all kinds of energy
and heir to the former Atomic Energy Commission,
AEC, founded in 1946), NASA (or NACA, aeronautics, until 1958), the
National Institutes of Health (NIH), and the National Science Foundation 
(NSF). Other federal departments together account for no more than 6% of 
the federal total, although their role, here too, is substantial in some fields. 
NSF annual statistics (Federal Funds for Research and Development) provide
a good, if probably not 100% accurate, view of their evolution. 

In 1940, the government allocated 26 MD (current money) to defense 
R&D, 29 to agriculture and some to geology and mining; there was also a 
National Bureau of Standards which had been created in 1901 on the model 
of a German laboratory where much important research was conducted to de- 
termine accurate values for physical constants, weights, measures, etc. During 
WW I, the Washington Academy had created a National Research Council 
which did a lot of military research and was officially recognized after the 
war, but it got most of its small budget from private sources and spent 
it mostly on fellowships for young scientists. Otherwise, practically nothing 
went to research proper except for the creation in 1937 of a National Cancer 
Institute. 

The picture had changed by 1945. Out of the 1590 MD in federal funds for 
R&D, agriculture still got 34, defense (atomic excluded) 513, the Manhattan 
Project (atomic) 859, and 114 went to the Office of Scientific Research and 
Development (OSRD) created during the war to organize military research
in all sectors. Not surprisingly, defense justified 90% of the total. During the 
war, industry spent less of its own funds on R&D than in 1940 in constant 
dollars, but, of course, received a flood of military contracts. Many univer- 
sities received undreamed of amounts of money for military research: MIT 
117 MD, CalTech 83 MD, Harvard 31 MD, Columbia 28 MD, to name but 
a few; new off-campus installations had to be set up for the most expensive 
projects. In 1950, out of 1083 MD in federal funds, agriculture got 53, DoD 
652, AEC (essentially military at the time) 221, and NACA (similarly) 54 
instead of 2 in 1940. Although Truman had considerably “restricted” the to- 
tal defense budget after 1945 (13-14 BD until 1950, as against one in 1940), 
it remained large enough to finance a few large-scale technological projects, 
such as the development of the big jet bombers (B-47 and B-52) and super- 
sonic jet fighters, progress in rockets and missiles, and the beginning of the 
development of nuclear submarines. The contributions of the main agencies 
are as follows for selected subsequent years, in current money: 


         Total     DoD  AEC/DoE   NASA     NIH     NSF


1953      1851    1275      278     84      59   0.151
1958      4774    3480      828     97     218      41
1966     16178    7099     1441   5327    1142     323
1975     19859    9179     2439   3207    2436     618
1982     37822   16786     5896   3708    3950     976
1987     57099   35708     5529   4096    6643    1531
1994     69450   34818     6959   8811   11141    2212
2000     77356   33215     6873   9754   18645    2942


These figures show the relative importance of the main federal sources of 
R&D money. DoD’s contribution has always been, by far, the most important 
one, but to gauge the real size of defense-related funds, one should also take 
into account the AEC/DoE budget. In 1968, for example, out of a total of 
about 1600 MD, AEC’s R&D budget included 400 for research proper (48 
for weapons, 265 for physics, 86 for biology and medicine); 425 went to the 
development of weapons, 491 to the development of nuclear reactors, much 
of it for the Navy and Space, and 224 to construction work. It may also be 
assumed that NASA’s R&D was not totally disconnected from defense even 
though the DoD itself spent between 500 and 1100 MD yearly on R&D for 
military astronautics between 1961 and 1965, and between 2 and 3 billion 
for the development of missiles. It may also be assumed that the CIA and 
the National Security Agency (NSA, cryptology, reconnaissance satellites, 
etc.), whose contributions are not reported, had sizable amounts to spend 
on R&D. And although much R&D for military industrial projects was to a 
large extent financed by the government even prior to any production, still 
some of it was private money. 
On the other hand, the prospect of a federal budget surplus under Clinton
prompted Congress to adopt a bill in 1998 to double the non-defense part of 
the federal R&D budget over ten years. This target was reached for the NIH 
by 2003, at least in current dollars, to the displeasure of specialists in other 
domains left behind. 

The above table shows a substantial decrease of DoD funds after the 
Reagan years, but the trend was later reversed, courtesy of Mr Bin Laden.
According to a recent analysis by the American Association for the Advance- 
ment of Science (www.aaas.org/spp/rd), out of the projected federal budget 
for R&D in the year 2005, the defense-related part, including 4.5 BD from 
the DoE, should amount to well over 74 billion, and the non-defense portion 
to over 57, of which NIH will get almost 30, Space over 10 and NSF 3.8. A 
new domain, antiterrorism R&D, will absorb 3 BD, of which 1.7 will go to 
NIH to fight bioterrorism, e.g. anthrax pocket weapons which are seen as a 
serious threat. Although the 2004 budget is the biggest ever since 1945, even 
in constant dollars, and far bigger than any other country’s, America is able 
to afford it by devoting less than 4% of her GNP to total defense, as against 
at least 12% at the height of the Cold War. This is because GNP has grown 
at least five times in constant dollars since 1945. 

The tables above make it possible to estimate the percentage of Defense 
money over total R&D, by converting current dollars into 1996 dollars. In 
1958, defense-related federal funds for R&D accounted for 82% of all federal 
funds and 53.1% of national R&D expenses, hence more than industry’s own 
contribution. In 1987 defense still accounted for almost one third of total 
R&D and 68% of federal R&D; it later decreased to a low of 13.6% in 2000 
because of the growth of industry’s own funding; Microsoft for instance is 
currently spending about 5 BD a year on R&D and presumably does not use 
the Pentagon’s money to develop Windows, which may explain its quality... 
R&D is mostly development, but the importance of development in Defense 
is particularly striking: 2.9 BD out of 3.5 BD in 1958 and 28 BD out of 33 
BD in 2000, with similar proportions in the interval. Industrial firms always 
get at least 60% of the DoD funds for R&D, while about 30% of the money 
is spent in DoD’s own technical centers. According to the AAAS, only 5.18 
BD should go to basic and applied research in 2005. 

Some federal funds go to so-called Federally Funded Research and Devel- 
opment Centers (FFRDC). These were organized during or after WW II and 
are administered by industrial firms, universities, or nonprofit institutions. 
The first category includes huge centers such as Idaho, Oak Ridge, Sandia 
and Savannah River producing nuclear material or weapons, though on a very 
reduced scale now. The second includes the MIT Lincoln Lab (electronics, 
radar, SAGE, anti-missiles, etc.), the Jet Propulsion Lab (Cal Tech), 
Argonne (Chicago U.), Brookhaven (several universities) and huge 
installations for particle physics at Berkeley, Princeton, Stanford, etc.; 
last but not least, it also includes the Los Alamos and Livermore labs, 
initially founded for the development of nuclear weapons and administered 
by UC Berkeley, which didn’t 
always relish it although it earned money from it. In the third category, there 
is the Rand Corporation which was organized in 1946 by Douglas aeronau- 
tics and the Air Force and soon became a research center financed by the 
Pentagon; it became famous in the 1950s for its development of operational 
research, game theory and mathematical programming, and for its slightly 
pathological strategic studies, particularly when Herman Kahn, in Thinking 
About the Unthinkable and other books, made them popular by explaining nuclear 
war “escalation” theory (up to what he called a “nuclear spasm” or, as some 
said, “orgasm”) as if it were a very funny poker game. 

These cold figures should be supplemented with some more concrete infor- 
mation. As mentioned above, academic research got very little from Wash- 
ington before the war; it was financed by university funds, philanthropic 
organizations and, in many engineering departments, by industry, enough to 
increase significantly the number of scientists during the inter-war period. 
The Rockefeller Foundation, which up to 1932 spent 19 million on academic 
research, spent a lot more on medicine than Washington. It also financed 
physics during the 1920s: thanks to its fellowships, many scientists, including 
future American designers of atomic bombs, learned their trade in Europe; 
European physicists were invited to America, some permanently; and the 
Foundation financed new laboratories in Copenhagen and Göttingen as well 
as the Poincaré Institute in Paris. By 1930, and like many social scientists, 
it was having doubts over the value of physical sciences and technology: gas 
warfare in WWI had been rather bad publicity, as had the disruption of 
the American way of life and traditional values by technological advances. 
It therefore decided in 1932 to concentrate on applications of physics and 
chemistry to biology, which made it a prime sponsor for many of the future 
creators of molecular biology. Ernest Lawrence, and he alone, succeeded in 
attracting big money for his Berkeley cyclotrons: as much as one million 
in 1940 — a staggering sum at the time for physics — from the Foundation 
which bet on the prospect of cheap artificial radio-elements to fight cancer; 
otherwise, almost all of his money came from other philanthropists and 
the university. America had a good number of first class physicists by the 
1930s; three dozen generally small particle accelerators were built in uni- 
versities (Germany had none in 1940, France had one). In these depression 
years, particularly 1932-1934, attempts to get federal money were unsuccess- 
ful — almost all the New Deal relief money went to jobless people. Although 
senior scientists were generally comfortable, many younger ones were badly 
paid, and some unpaid ones spent part of their time making money to sur- 
vive while continuing laboratory work. It is remarkable that the production 
of PhDs between 1930 and 1939, namely 980 in mathematics and 1,924 in 
physics, was almost triple that in the preceding decade; this was mainly due 
to the strong growth of higher education in all domains. Without federal help 
to speak of, America was thus already the new dominant country in physics. 

There were Jewish refugees in all intellectual domains after Hitler’s seizure 
of power; though they were generally much younger, less well known than 
Albert Einstein, and not always welcome as Jews at the time, many Ameri- 
can scientists helped them. After having a hard time until the war, most of 
the refugee scientists — almost 200 in mathematics and physics — were to find 
permanent university positions after 1945, and several dozen became leading 
scientists, or even stars. This also contributed to America’s standing in these 
two domains, as in many others. 

MIT, where many top American industrialists and engineers had been 
educated since the 1880s, already had the biggest electrical engineering de- 
partment in the world, thanks to industrial contracts, gifts from alumni, and 
tuition fees. Private industry spent about 250 MD on R&D in 1940, partly 
in laboratories created fifteen or thirty years earlier by big companies like 
General Electric, AT&T, Westinghouse, or DuPont; they started doing some 
basic research in the 1920s. In 1925, AT&T, the private telephone monopoly, 
founded its Bell Labs, which soon became the largest industrial research lab- 
oratory in the world, with a 20 MD budget and some 2,000 employees by 
1940; a physicist there won a Nobel prize for experiments on electron diffrac- 
tion which confirmed the dual nature of elementary particles. Another Nobel 
Prize went to General Electric’s physical chemist Irving Langmuir (who had 
his first success in 1913 in discovering that filling incandescent lamps with 
nitrogen greatly increased their life). At DuPont, a basic research program 
on polymers began in 1927, with initial funding of 250,000 dollars (to be 
compared with Columbia University physics department’s budget of 15,000 
dollars in 1939); from there came nylon in 1938, whose development 
cost about 2 MD and generated a 600 MD business twenty years 
later. There was also much R&D in the petroleum industry, with projects 
costing from a few hundred thousand to 15 MD. This figure looked enormous 
at the time. 

As previously mentioned, the war changed the picture. At MIT, a Radiation 
Lab was founded in order to develop radar; scientists of all levels worked 
there, including Hans Bethe (until 1943), Isidor I. Rabi and Lee A. DuBridge 
who headed the lab; Luis Alvarez and other young collaborators of Lawrence 
brought the expertise in electronics and high frequencies they had acquired 
in Berkeley; many of these people became very influential science advisors 
to the government after the war. At MIT and elsewhere, the work on radar 
required many advances in all domains of electronics, e.g. in high frequencies, 
or in semi-conductors because glass valves could not detect centimetric radar 
waves. Methods for purifying germanium were found at Purdue and were 
crucial to the invention of transistors a few years later, while Bell Labs did 
the same, with less success, for silicon. The size of the radar business can be 
gauged from the fact that the Rad Lab employed up to 4,000 people, while 
the industrial production proper cost almost 3 BD — more than the atomic 
bomb project. 

Headed by General Groves, the Manhattan Project — that most spectac- 
ular success story, though less useful for winning the war — employed hun- 
dreds of scientists in Los Alamos and elsewhere; these included Fermi, Bethe, 
James Franck, Harold Urey, Arthur Compton, Lawrence, von Neumann, Al- 
varez, and even Niels Bohr, all of them (except von Neumann) past or future 
Nobel prizewinners. Oppenheimer, a former Rockefeller fellow and the best 
native theoretician, headed Los Alamos with fantastic brio; he understood 
everything and made the whole enterprise succeed. He was under permanent 
surveillance by the FBI who were well aware of his pre-war leftist leanings 
and connections; this did not prevent the bombs’ blueprints from quietly 
leaving Los Alamos for Moscow in a Plymouth driven by Klaus Fuchs in the 
summer of 1945. The project cost two billion, 70% of which was spent on the 
production elsewhere of U-235 in a gigantic isotopic separation factory or in 
Lawrence’s calutrons, and of Pu in huge atomic piles. Most of the basic tech- 
niques later used in civilian nuclear energy were invented between 1942 and 
1945, and this allowed General Electric, Westinghouse, DuPont and other 
companies to learn them and to become world leaders after the war in using 
nuclear power for electricity production, and first of all for the propulsion of 
submarines or aircraft carriers. More about this in the Internet file. 

In 1945-1946, nuclear physicists were rewarded with millions left over from 
the Manhattan Project, which allowed them, among other consequences, to 
build new particle accelerators whose cost eventually came to billions (not 
millions). Before 1940, this prospect would have been dismissed as utterly 
insane. The AEC/DoE has funded this domain in America from 1947 to this 
day, while the Rockefeller Foundation withdrew its support after 1945 since 
the government could provide far more; in addition, since 1941 Lawrence 
and others had been hinting at spreading radioactive waste over or in front 
of enemy troops in case of war, which was not quite as glamorous as fighting 
cancer. 

In a famous 1945 report, Science, the Endless Frontier, the chief of 
military R&D (OSRD) during the war, Vannevar Bush, advocated the 
establishment of a National Science Foundation funded by the government and 
whose president and programs would be chosen by scientists; the project was 
rejected by the President. It came into being in 1950 as a federal agency 
funded and governed by the government and controlled by Congress like other 
agencies, with, of course, plenty of scientific advisory committees; but it 
got very 
little money before Sputnik, as the table above shows. In the bio-medical 
sector, where a first National Cancer Institute had been founded in 1937, 
new National Institutes of Health were established; with strong backing from 
Congress and voters, they continued to grow and multiply and are now by 
far the most important non-defense source of federal money. Meanwhile, the 
Office of Naval Research founded in 1946 spent some 20 MD per year to help 
research in all domains, mainly to keep in touch (“in case”) with scientists 
and research; mathematics got about 10%, but a threateningly increasing 
part of it (up to 80% in 1950) was — already! — funding the development at 
MIT of a futuristic Whirlwind computer working in real time; a riot ensued, 
and Whirlwind would have died but for the birth of a far better sponsor in 
1950, namely the air-defense system of the American continent, as we shall 
see later. 

Private universities, where government interference was anathema before 
1940, reversed their principles: ONR was very liberal and people got used to 
this new kind of “tainted money”; after all, nobody had ever asked trustees or 
benefactors of the rich universities how they became so wealthy; but it 
sometimes took several years before federal money (and possibly classified 
military contracts) was accepted. CalTech was still a small university in 1945; with 
a board of trustees made up of very conservative bankers and industrialists 
who approved the policy of basic research presided over by physicist and No- 
bel Prize winner Robert Millikan, it was several years before it bowed to the 
inevitable; meanwhile, the off-campus Jet Propulsion Laboratory founded by 
von Kármán prospered on guided missiles and DoD money, as was the case 
at Johns Hopkins with the Applied Physics Laboratory founded during the 
war. Julius Stratton, a future president of MIT who during the war had close 
ties with the higher echelons of the Pentagon (he was one of the stars of the 
MIT Radiation Lab), wrote in October 1944 to MIT’s president: 


Twenty-five years ago everyone talked about the end of war; today 
we talk about World War III, and the Navy and Air Force, at least, 
are making serious plans to prepare for it. Inevitably this national 
spirit will react upon the policies of our educational and research 
institutions. It always has, and we might just as well face it (...) 
We shall have to deal with the Army and Navy and make certain 
concessions in order to meet their needs. 


This meant that by 1950, 85% of the MIT total research budget came from 
the military and AEC, with a still higher proportion for physical sciences in 
other elite universities. Frederick Terman, another star in electronics, wrote in 
1947 to his university’s president that 


Government-sponsored research presents Stanford, and our School 
of Engineering, with a wonderful opportunity if we are prepared to 
exploit it, 


which of course they were. The importance to the military of these univer- 
sity departments was due not only to their research work, but also to their 
educating thousands of scientists and engineers for defense work in particular. 

Various more or less successful attempts were made after 1950 to bring 
scientific advice to the highest levels of government, particularly the DoD; it 
was Sputnik which brought scientists to the White House. Meanwhile, the 
Korean War was an opportunity to organize “summer studies” during which 
scientists, engineers and military men would gather for several weeks in or- 
der to study such (classified, i.e. secret) defense problems as anti-submarine 
warfare, tactical nuclear weapons, air defense, etc. 

The size of American defense activities in the 1950s and 1960s can easily 
be explained by political factors and by reactions to perceived Soviet threats 
(or counter-threats to perceived American threats: bombs, bombers to deliver 
them, and the “encirclement” of the USSR by US air bases). As we have seen, 
the first Soviet atomic test launched America into the race for the H-bomb. 
In the spring of 1950, the celebrated NSC-68 report of the White House 
National Security Council, vastly exaggerating the Soviet military threat and 
supposed plans for world domination, recommended (among other things, e.g. 
much stronger West European forces) a huge increase of the Defense budget; 
the figures which were known but remained unwritten, namely 40-50 billion 
instead of 13-14, were judged excessive even by the military, who did not 
know how to spend so much money. Truman did not agree either, but the 
“Socialist camp” forced it on him by sparking the Korean War. In particular, the 
production capacity for U-235 and Pu was increased in a staggering way: five 
new piles for the production of Pu, one for the production of tritium, and two 
more huge isotopic separation units, with sixteen times the capacity of the 
1945 factory, which had already been enlarged; up to 85 tons of U-235 could 
be produced per year, which needed 6,000 megawatts of electricity, or 12% of 
total US production. Nuclear weapons of all types grew in America at a rate 
of several thousand per year, to reach 32,000 in 1964, with powers ranging 
from a few tens of tons up to several megatons in TNT equivalence. This 
was about fifteen times the Soviet arsenal at the time and could be delivered 
by 800 intercontinental ballistic missiles (ICBM), 200 submarine-launched 
missiles (SLBM), a thousand fighter-bombers based in Europe, the Middle 
East, Japan or on aircraft carriers, and strategic bombers (about 2,000 B-47 
and 700 B-52 were built before 1962). 

A gigantic system to defend America against Soviet bombers was built, 
as we are now going to see. The first Soviet atomic bomb led people at MIT 
to take the first steps to protect the USA from future Soviet bombers in 1950 
(this threat was dismissed by Curtis LeMay, the Strategic Air Command 
(SAC) chief during the 1950s: his personal strategy was to wipe out Soviet 
planes, copies of the US B-29 bombers of 1944 vintage, before they could take 
off, but bypassing the President was slightly illegal...). This originally small 
Project Lincoln based on the Whirlwind computer led to the founding of a 
Lincoln Lab at MIT, and to the gigantic SAGE system of continental defense 
(a precursor to SDI), at a cost of 30 billion (or 200 billion in 1996 money, 
and much more if personnel and other costs are included). Thousands of Bell 
Labs Nike-Hercules missiles, each carrying a 2 to 30 KT atomic warhead, 
could destroy entire fleets of incoming aircraft, assuming the Soviets were 
clever enough and able to send such fleets over the North Pole in suicide 
raids since, in any case, they could not make the round trip until big jets — 
never more than 200 — began to appear in 1955. Bell Labs, which had de- 
veloped anti-aircraft rockets since 1945, managed everything while hundreds 
of subcontractors in practically all domains of technology helped develop the 
hardware and software needed in SAGE. SAGE was obsolete as soon as it 
became operational in 1960-1962: bombers were replaced by unstoppable 
missiles after 1962, which led to the first and useless anti-missile systems, 
including the highly controversial Nike-Zeus missiles with 60 to 400 KT war- 
heads, based around big cities and never deployed. The USSR’s program 
evolved in similar fashion, but was even more expensive since missiles and 
bombers could come from many directions. 

The SAGE project however played a major role in all kinds of technical 
advances, particularly long-range “over-the-horizon” radars, guided 
anti-aircraft weapons, and computers. In this last field, it led to magnetic 
core memories, video displays, light pens, graphics, simulation, synchronous 
parallel logic, analog-to-digital conversion and transmission of radar data 
over telephone lines via the first transistorized modems made by Bell, multi- 
processing, automatic data exchange between different computers, etc. With 
its hundreds of thousands of lines of code and hundreds of computer screens, 
SAGE provided the first opportunity to train several thousand programmers 
(most of whom later went to industry); this was done by the SDC branch 
of the Rand Corporation, which was founded in 1957 to that effect. Among 
many other machines, SAGE needed fifty-six IBM AN/FSQ-7 and -8 (or 
“Whirlwind II”) computers; there were twenty-four SAGE main command 
centers connected to a pharaonic installation under the Colorado mountains, 
itself connected to the White House and Pentagon; each of the centers used 
two of these IBM computers working in tandem to increase reliability. Made 
to order at a cost, in current money, of 30 million apiece, each of these 
machines weighed 275 tons, had some 60,000 valves, used 32-bit words, had 
a magnetic core memory — one of the great innovations from Whirlwind — 
of about 270 kilobits, twelve magnetic drums each storing 12,288 words of 
program, and was connected to about one hundred screens displaying enemy 
planes’ trajectories and enabling operators to vector fighter planes graphically. 
It needed 750 kW of electric power to run and a hurricane to evacuate 
the heat it generated. These performances may look puny in 2005, but there 
was nothing more powerful at the time and, of course, the new techniques 
were put to good use in IBM’s future commercial computers. All of the lat- 
ter were transistorized after 1960, the first large ones (series 7090) being 
delivered to the three gigantic radars of the Ballistic Missile Early Warning 
System in Alaska, Greenland and England. 


References on SAGE: chap. 4 of Atomic Audit, Edwards’ chap. 3, 
Kent C. Redmond and Thomas M. Smith, From Whirlwind to Mitre. 
The R&D Story of the SAGE Air Defense Computer (MIT Press, 
2000), very weak on technology, and Thomas P. Hughes, Rescuing 
Prometheus (Random House, 1998). 


Then came Sputnik in October 1957, which scientists used very success- 
fully to clamor for increased research funds. NACA was transformed into 
NASA, with very soon a budget in billions of dollars, while the Defense bud- 
get proper decreased. A scientific committee (PSAC) was instituted at the 
White House. The Advanced Research Projects Agency, ARPA, was founded 
by the DoD in order to fund and organize the most sophisticated research 
projects with military implications. Americans reacted to the “missile gap” 
with wild and shifting predictions on the size of the Soviet arsenal (100 in 
1959, 500 by 1960 and 1,000 by 1961-1962) from the CIA, the Air Force, 
journalists, and Democratic politicians, including Kennedy and especially Johnson, 
wishing to destroy the 1952-1960 Eisenhower Republican administration. But 
radars from Turkey and Iran had detected Soviet missile tests in 1953-1954, 
and, from 1955 on, absolute priority was given to similar American programs, 
Atlas and Titan, soon followed by the silo-protected Minuteman series of 
ICBMs, the Polaris missiles for nuclear submarines, and the first satellites 
for reconnaissance, infrared detection of missile firings, meteorology, 
communications, etc. (1959-1961). Extended flights over Soviet territory first by U-2 
spy planes, then satellites, proved in 1960 that there was indeed a big “missile 
gap”: perhaps four Soviet operational missiles, to dozens of American ones. 

Like the Korean War, Sputnik and Khrushchev’s boasts proved to be a 
self-defeating move and another wonderful opportunity for the American and 
Soviet “scientific-military-industrial complexes”. The Soviet arsenal, vastly 
outnumbered by the American arsenal until the 1970s, was nevertheless big 
enough to make an American attack unlikely, and in any case America’s 
top political rulers found the Air Force’s apocalyptic war plans quite 
repellent, although they knew they might have to “push the button” as a last 
resort (see my vol. I, p. 122; in 1960, over 150 weapons were reserved for 
the Moscow area alone, and quite a number of them would have destroyed 
each other). To paraphrase a journalist writing in Science, September 27, 
1974, these huge defense systems were the cathedrals of a century that future 
historians will characterize by its extraordinary technical capacities and its 
permanent devotion to the mortuary arts. And so on, with ups and downs, 
until the fall of the Soviet Union. The most exotic parts of Reagan’s Star 
Wars project were terminated, but a less ambitious anti-missile program is 
still going on, at the rate of several BD per year, with a first deployment in 
Alaska of weapons guided on a collision course with enemy missiles (a fasci- 
nating problem in Control Theory) although no one can guess who would be 
foolish enough to launch them. America’s military doctrine is now undergo- 
ing a “Revolution in Military Affairs” based on “Space Dominance”, which 
aims at fully integrating every weapon and everyone — from the President and 
the Pentagon warlords down to the GI on the battlefield — through all kinds 
of satellites, drones, telecommunications, information networks, etc. You will 
find an impressive survey of it in Introduction au siècle des menaces (Paris, 
Odile Jacob, 2004), by Jacques Blamont, a French specialist in Space Sciences 
with long and strong ties to the Jet Propulsion Lab (and, more recently, So- 
viet astronautics), and a member of the US Academy of Sciences. Another 
“revolution” has been under way since the Strategic Computing Initiative 
of the 1980s: substituting all kinds of “intelligent” robots for weak mortals 
on the battlefields of 2030, according to the New York Times (02/16/2005). 
Contracts worth 127 BD have already been issued for this Future Combat 
Systems project, which will contribute to boosting weapons acquisition costs 
from 78 BD now to 118 by 2010. Those who believed the end of the Cold War 
would slow down the technical progress of armaments were badly mistaken... 

The development of nuclear weapons, fighter planes, bombers, missiles, 
nuclear submarines, aircraft carriers, SAGE, satellites for C4 RI (command, 
control of operations, communications, computers, reconnaissance and intel- 
ligence), etc. relied on and greatly encouraged technical progress in dozens of 
less spectacular domains: electronic components (from glass valves to tran- 
sistors to printed circuits to integrated circuits to VLSI...), computer hard- 
ware and software, navigation and guidance systems, infrared detection, fire 
control devices, radar and sonar, microwave propagation, space telecommu- 
nications, materials, etc. The list is endless. 

The development of transistors and integrated circuits is a good example. 
Semi-conductors had been known for a long time and were the first detectors 
used in wireless in the 1900s. Systematic experimental studies in the 1930s 
and during WW II, as well as the development of a solid-state theory using 
quantum mechanics, had led to a good understanding of the phenomena by 
1945, and, at Purdue university, to methods of obtaining highly purified ger- 
manium (so named by its German discoverer), from which rectifying diodes 
were mass produced for radar detection. Bell Labs did the same with silicon, 
with less success at the time. After 1945, they tried to discover solid-state 
amplifiers, and the first very primitive point-contact transistors were made 
there in 1947 by two physicists, John Bardeen and Walter Brattain, headed 
by William Shockley who a few years later found a way to make industrializa- 
tion far easier; all three shared a Nobel Prize. Transistors, patented by Bell 
in 1948, were expected to replace electronic valves and electro-mechanical 
switches in a myriad of devices used by the AT&T telephone system. But 
there was nothing urgent here — the capital invested in standard equipment 
was far too high to be scrapped — and, anyway, replacement would require 
years of further development and industrialization. AT&T, however, was un- 
der an anti-trust suit at the time and the military watched the development 
of transistors with great interest. Bell therefore organized a first informa- 
tion meeting at the beginning of 1951 for military and government officials 
only, then a symposium in September for some three hundred American and 
European engineers to whom the characteristics of a dozen transistors were 
disclosed. In 1952, Bell decided to sell its patents to 36 companies and, in 
April, to divulge the know-how to licensed companies. A first production 
unit for military transistors was built by Western Electric, the manufactur- 
ing branch of AT&T. The anti-trust suit ended in 1956 and, among other 
clauses, AT&T was ordered to limit its production to its own needs and to 
the government market, for which many Bell innovations were made; this 
favored other manufacturers. The Army Signal Corps had already issued 
production contracts to twelve makers for use in the forthcoming strategic 
missiles, and demanded 3,000 units of thirty different types per month. Since 
at that time only 5 to 15% of the production was free of defects, this re- 
quired much higher production capacities, with very high unit costs. But the 
rate of rejects, and hence prices, soon dropped, and sales to less demanding 
buyers went from 14 million in 1956 to 28 million in 1958. The military were 
interested in transistors because they were small and light, consumed very 
little power, and were much less sensitive to shocks, vibrations and wear than 
valves. First models of transistorized computers were built at Bell Labs and 
Lincoln Lab (MIT) in the 1950s, for the military, of course. 

The first civilian commercial uses of transistors were for hearing aids 
(Raytheon, 1954) costing 150-200 dollars; transistor portable radios came a 
few years later. It took at least ten years before a large commercial market 
developed because classical valves were far cheaper (one dollar instead of 
eight around 1953), had much better characteristics than early transistors, 
were much easier to make, and were much more familiar to most electronics 
engineers; the main advantages of transistors were not needed in most ap- 
plications, though they attracted the military. Between 1954 and 1956, the 
markets for transistors and valves were 55 million and over 1,000 million dollars respectively. 
And though several established valve manufacturers (General Electric, RCA, 
etc.) had 31% of the market in 1957, new and much smaller firms (Texas 
Instruments, formerly a geophysical services company, Transitron, Hughes, 
etc.) had 64%. 

Integrated circuits were invented in 1958 by Texas Instruments without 
military funding (military projects for miniaturizing electronic circuits all 
failed or came too late in the 1950s), but their mass production was made 
possible by the invention of the so-called planar process for silicon transistors 
by a group of eight physicists and engineers who left a company the insuf- 
ferable Shockley had founded in 1954. The Fairchild Company which, since 
the 1920s, made aerial cameras and later components of analog computers 
(all mostly for the military), set up for them the Fairchild Semiconductor 
Corporation in 1957. Since they had their eye on the commercial market 
(some of them founded Intel a few years later), they rejected military R&D 
contracts to remain free of having to develop products which, although mil- 
itarily important, would be of little commercial interest. They nevertheless 
decided to concentrate first on the improvement and manufacture of high 
performance silicon transistors for the military market. This was the time 
the military was beginning to replace analog computers with digital ones in 
avionics and missiles because only silicon — and neither very expensive ger- 
manium, nor electronic valves — could stand the high temperatures, shocks 
and vibrations prevalent in many military systems. Their first customer was 
IBM which bought one hundred Fairchild “mesa” transistors at 150 dollars 
apiece for use in the navigational computer for the prototype of the B-70 
supersonic bomber (they had already made the analog computers for the B-52s, 
a much bigger market). They had no competitor other than Bell Labs, their 
mesa transistors immediately found many other avionics uses, and their sales 
jumped from 65,000 dollars in September 1958 to 2.8 MD for the first eight 
months of 1959. Their most important customer was Autonetics, in charge of 
developing the digital computer guidance system for the Minuteman missile. 
Other early uses included an air-to-air missile, a torpedo, and the Apollo
spacecraft. Problems of reliability led to the “planar process” to make
much better transistors; the rate of defect-free components was 5% at first, 
but they were under such pressure from Autonetics, which demanded one 
year without failure, not to mention the now growing competition in mesa 
transistors, that they persisted, then developed ultra-reliable planar diodes 
for computers and eventually integrated circuits. The planar process made 
it possible to fabricate many components on the same silicon wafer and to 
connect them, again with a very low initial proportion of defect-free circuits. 
All of this looks very simple, but required extraordinary standards of clean- 
liness, manufacturing skills, and an unprecedented level of discipline on the 
workforce, as one of my sources said.

Total sales of ICs amounted to 4 MD in 1962, 41 in 1964, 148 in 1966 
and 312 in 1968, while the average unit price dropped from 50 dollars to 
2.33; in those same years the military bought 100%, 85%, 53% and 37% of 
the total sales. More generally, the military part of the electronics industry’s 
total sales, which was 24% in 1950, climbed to 53-60% during the years
1952-1968. The general pattern in electronics at the time was that the first 
customers, namely the military and their industrial contractors, bought the 
initial product at prices which included most of the R&D and at least part 
of the tooling; prices then went down to a level which civilian industry and 
business could afford for their own uses, which in time lowered the prices 
again until the general public could buy solid-state gadgets like radios, TV 
sets or PCs. With a huge civilian market after 1980, chip makers like Intel 
could continue to improve their products with little help from the military; 
Intel even refused to work on highly sophisticated very high speed integrated
circuits (VHSIC) with no civilian uses.

The military actually benefited from this civilian market as they too 
needed a lot of standard electronics that could be purchased off-the-shelf at 
low prices. For this reason and to help American industry against Japanese 
competition, they became interested in “dual” technologies with military and 
civilian uses. The DoD still spends about 25% of its R&D budget on electron- 
ics and communications, but for more sophisticated products than personal 
computers... 

The early development of computers was still more influenced by the military.
Explaining it here would take too much space; see the Internet file. I’ll
merely point out that the 35 computers made between 1945 and 1955 were 
entirely financed by the DoD, with the exception of two in universities which 
my source does not know, and of the von Neumann Princeton computer 
which was financed by the Army, Navy, AEC and RCA (but its five copies 
were financed by AEC or, at Rand, by the Air Force). Almost all of these 
machines were one of a kind; only three companies made several production 
units: UNIVAC, the company Eckert-Mauchly had founded in 1947 in order 
to make huge data-processing machines with the commercial market (bank- 
ing, insurance, etc.) in mind, although it also had military customers; ERA, 
founded by a team of former cryptologists from the Navy who made very 
advanced computers for the National Security Agency; and IBM which, at 
the start of the Korean War, decided to make digital machines. They looked 
for customers and found seventeen, either military or in the military industry.
Of course a huge civilian market developed later (mainly after 1960),
but the influence of military research contracts and procurement always was
extremely powerful, and still is.


References: Herman H. Goldstine, The Computer: From Pascal to
von Neumann (Princeton UP, 1972), Kenneth Flamm, Targeting
the Computer (Brookings Inst. Press, 1987) and Creating the Computer:
Government, Industry, and High Technology (Brookings, 1988),
Arthur L. Norberg and Judy E. O’Neill, Transforming Computer
Technology: Information Processing for the Pentagon, 1962-1986
(Johns Hopkins UP, 1996), Donald MacKenzie, Knowing Machines
(MIT Press, 1998), Janet Abbate, Inventing the Internet
(MIT Press, 1999), National Academy of Sciences, Funding a Revolution:
Government Support for Computing Research (NAS Press,
1999), very explicit and thankful to the DoD, Alex Roland & Philip
Shiman, Strategic Computing: DARPA and the Quest for Machine
Intelligence (MIT Press, 2002).


Below industry level, all domains of science, from mathematics and computer 
science to nuclear physics, electronics, optronics,..., oceanography, geology 
(used e.g. for monitoring underground nuclear tests) and even to some extent 
biology and medicine, expanded tremendously since many of their results and
many experts were needed in all domains of high technology and defense. 


§ 3. Applied mathematics in America 


In the entertaining chapter of his autobiography, Un mathématicien aux
prises avec le siècle (Paris, Odile Jacob, 1997, trad. Birkhäuser), which he
devotes to his teaching at the École polytechnique, Laurent Schwartz accuses
(p. 355) the French pure mathematicians, and especially the Bourbaki
group, of having ostracized their applied colleagues. As a matter of fact, for 
at least ten years there was nearly nobody to be “ostracized” before the rise 
of Jacques-Louis Lions (1928-2001), a very bright student of Schwartz who 
first worked on distributions and partial differential equations (PDEs) in the 
modern way made possible by the development of functional analysis. He 
discovered applied mathematics and computers in America in 1956 in cir- 
cumstances that will be explained below, and later founded the very brilliant 
French School of Applied Mathematics; he himself was appointed a professor 
at Nancy in 1954, in Paris in 1963, at the Polytechnique (1965-1986), and at 
the Collège de France in 1973.

From 1980 to 1984, he headed the French government National Insti- 
tute for Research in Informatics and Automatics (INRIA) with which he had 
been connected for ten years, the French NASA (CNES) from 1984 to 1992, 
and he won some of the highest international prizes; quite a victim of our os- 
tracism, and otherwise a great mathematician with some 50 doctoral students 
and hundreds of “descendants” in the world. See a substantial biography by 
Roger Temam, one of his principal students, at
www.siam.org/siamnews/07-01/lions.htm.

Schwartz decrees that every mathematician must concern himself with the 
applications of what he is doing without, it seems, being aware of the fact 
that “to concern oneself with” may have quite a number of different mean- 
ings, whether in French or in English. He provides neither a justification for 
his categorical imperative nor the slightest account of the very diverse appli- 
cations of mathematics. The fact that applied mathematics were undergoing 
a powerful expansion in the United States and USSR among others seems to 
justify everything, without it being necessary to explain this strange and very
new development in the two countries which led the arms race until 1990. 

The development of applied mathematics in the USA which so inspired 
Schwartz is not too difficult to explain, even though much remains to be 
done since physics and technology, being far more spectacular, have almost 
monopolized historians until now. The Soviet situation, although less well 
known, was certainly no better. 

Before the war, “pure” mathematics prevailed in universities everywhere 
(except in the USSR, since this “bourgeois” concept was anathema to Marx- 
ism); engineers and physicists almost always solved their mathematical prob- 
lems by themselves, even when the new quantum mechanics obliged physicists 
everywhere to rediscover strange mathematics. By the 1930s, the situation 
began to change in a few places, partly due to the arrival of European Jew- 
ish refugees. Richard Courant, Kurt Friedrichs, Fritz John and Hans Lewy 
brought to New York University some of the Göttingen tradition founded by
Felix Klein forty years before. They dealt less with applied mathematics as 
we know them — computers had yet to come — than with often “modern” 
mathematics such as found in Courant and Hilbert’s celebrated Methoden 
der Mathematischen Physik. In 1937, the Army Ballistic Research Labora- 
tory at Aberdeen set up a scientific committee including von Neumann and 
von Kármán besides other luminaries. Von Kármán, formerly a student and
later a competitor of Ludwig Prandtl, the foremost German aerodynamicist
in Göttingen, had been at Caltech since 1934 (and part-time since 1926),
where he founded the future Jet Propulsion Laboratory. In 1945 he became 
the Air Force’s main scientific advisor and, in this capacity, one of the first 
promoters of atomic missiles. Classical Calculus being often sufficient, the 
WW II military R&D organization did not at first enlist mathematicians. 
Mainly at the request of mathematicians themselves, an Applied Mathemat- 
ics Panel was set up in 1942 with teams in several universities put at every- 
one’s disposal; they were, so to speak, the coalers of the R&D Dreadnoughts 
of which the officers were physicists. Stanislaw Ulam, who later became chief
mathematician at Los Alamos, had to ask his friend von Neumann for his
help in getting war work in 1943. Applied (or, as Saunders Mac Lane said,
applicable) mathematics, much of it boring, blossomed in all kinds of fields,
and some people converted to it for life. Shock wave propagation, surface
waves in water of variable depth, “hydrodynamics computations” for the 
Nagasaki bomb, gas dynamics, statistical optimization of air bombings and 
anti-aircraft defense, operational research, statistical quality control for the 
mass production of weapons, etc. For anti-aircraft defense, Norbert Wiener 
invented statistical prediction methods based on harmonic analysis and
analytic functions, but they were too sophisticated: he had been lured into the
mathematics of the problem. Transmitting orders or conversations in a secure 
way, that is to say unintelligible to non-authorized people, was very difficult, 
particularly communications between such high level persons as Roosevelt, 
Churchill or Eisenhower. This was intensively studied at Bell Labs, where
digitization of continuous speech was apparently invented, while separate
frequency bands were encoded by adding random numbers and reduction 
modulo 6 (it took quite a while for Bell’s engineers to discover it, although 
they were familiar with mod 2 arithmetic); each encoding system was used 
only once, and recorded on two highly precise phonograph records, one of 
which was used at the sending end and the other sent in advance to the 
receiving end; this involved a lot of very complex electronics using kilowatts 
of power to transmit milliwatts of speech, and the help of some people with 
mathematical abilities which the electronics engineers lacked. One of them 
was Claude Shannon, until 1941 at MIT and Princeton where he had studied 
applications of “Boolean algebra”, i.e. set theory, to the analysis of electronic 
circuits; he derived from his work at Bell Labs the Information Theory that 
made him famous after the war. If you understand electrical engineering, 
see A History of Engineering and Science in the Bell System. National Ser- 
vice in War and Peace (1925-1975) (Bell Telephone Laboratories, 1978), pp. 
291-316. 
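
The encoding principle described above (random numbers added to quantized
signal levels, followed by reduction modulo 6, with each key used only once)
can be illustrated by a minimal Python sketch. It is only an illustration
under stated assumptions: the function names, the software random generator
standing in for the pair of phonograph records, and the sample values are
hypothetical and are not taken from the Bell Labs system itself.

```python
import secrets

LEVELS = 6  # number of quantization levels, matching the "modulo 6" of the text


def make_key(length):
    """Draw a one-time key: one random level per sample (hypothetical helper)."""
    return [secrets.randbelow(LEVELS) for _ in range(length)]


def encrypt(samples, key):
    """Add the key to the quantized samples and reduce modulo LEVELS."""
    return [(s + k) % LEVELS for s, k in zip(samples, key)]


def decrypt(cipher, key):
    """Subtract the same key modulo LEVELS to recover the samples."""
    return [(c - k) % LEVELS for c, k in zip(cipher, key)]


# Assumed example: six quantized samples of one frequency band.
samples = [0, 5, 3, 2, 1, 4]
key = make_key(len(samples))      # would be recorded once and used only once
cipher = encrypt(samples, key)
recovered = decrypt(cipher, key)  # equals samples when both ends share the key
```

The reduction modulo 6 is what keeps the transmitted values within the same
six levels as the original samples; if the key values are uniformly random
and never reused, each transmitted value taken alone carries no information
about the sample it hides.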

Most postwar standard mathematical publications, written by mathe- 
maticians who are too busy or too discreet to consult sources, contain only 
rather abstract and summary generalities about the relevant mathematics. 
But luck may help those who read books that mathematicians generally do 
not open, or know of, since they don’t deal with mathematics. 

The 1945 bombings on Japanese cities (and earlier ones on Germany) 
led to a fascinating problem: to determine the right proportion of explosive 
and incendiary bombs for maximum damage. A Berkeley statistician, Jerzy 
Neyman, was then called to help; he used methods which, after the war,
made him a celebrity. Mathematical details are not to be found in my source,
and it is likely that Neyman’s contribution was less useful than those of
scientists, led by Harvard chemist Louis Fieser, who in 1942 invented napalm, 
among other incendiaries, though it was not widely used until the war in 
Korea. During a bombing raid, planes were supposed to drop bomb clusters 
at 50-foot intervals, which would open at 2,000 feet and disperse 38 smaller 
bombs, starting a dozen fires; thus a B-29 was able to set fire to a 350 x
2,000-foot area. Relying on statistical computations to get the best results would
thus have been a good idea (or a bad one, depending on your point of view). 
But recent books suggest that the method was discovered experimentally. 

On the other hand, the task of choosing targets, based at first on their 
contributions to Japanese armaments, and of evaluating the weight of bombs 
needed, was conferred on a Committee of Operations Analysts which relied 
on methods developed in Britain, mostly by physicists like P.M.S. Black- 
ett, initially for anti-submarine warfare, then for bombing operations. These 
problems involved fairly simple mathematics but gave rise soon after the war 
(first of all at the Rand Corporation) to an extravagant amount of hype 
in favor of game theory, Operations Research, and linear or dynamic programming;
it was claimed they were the truly “modern” mathematics that
could be applied to “solve the problems of society”: logistics, bomber basing,
optimizing a massive nuclear strike in case of war, dispatching packs of
Coca-Cola to troops in the field or grocery stores, etc. No wonder these dis- 
ciplines, which were still rather primitive mathematics assisted by the first 
computers, did not attract everyone after the war even if they found harmless 
applications later: 


What are we to think of a civilization which has not been able to 
talk about killing almost everybody, except in prudential and game- 
theoretical terms, 


a good question Oppenheimer asked on TV in February 1950 or perhaps in 
1959 — my sources do not agree. 

In the atomic sector, where the most difficult problems were to be found, 
the development of the implosion bomb (Nagasaki, plutonium) forced theo- 
reticians, headed by Hans Bethe, to solve numerically the PDEs governing 
the propagation of the convergent shock wave produced by classical explo- 
sives surrounding a sub-critical ball of plutonium. At hundreds of thousands 
of bars of pressure, plutonium behaves like a viscous fluid which you have to 
keep perfectly spherical, whence a “hydrodynamics” problem as they called 
it. To get the needed spherical shock wave required an assembly of 32 pentag- 
onal pyramids of fast explosives, with a half-sphere (“lens”) of slow explosives 
in the middle of each one. Ready in the Spring of 1945 after thousands of 
tests, this device required solving countless problems by American and British 
experts in explosives, many of them academics. Von Neumann contributed 
significantly to this effort in recommending that much larger amounts of 
conventional explosives be used than was projected, as well as in the design 
of the explosive lenses; after having learned chemical engineering at Zürich
Polytechnicum in his youth, he had participated at Aberdeen in the devel- 
opment of “shaped charges” for anti-tank projectiles. Hans Bethe, a nuclear 
physicist who knew a lot of mathematics, wrote a 500-page report on shock
waves at Los Alamos. 

To solve the two-dimensional PDE (three-dimensional computations were 
beyond them until the 1980s), they first used the same classical finite dif- 
ference method as for one-dimensional problems. It turned out that small 
variations in the dimensions of time and space steps led to large variations 
in the results: instability. Richard Courant was then called to the rescue. He 
explained to Bethe the successive approximations method that Friedrichs, 
Lewy and himself had used (Math. Annalen, 1928) to prove the existence of 
solutions: it prescribes non-obvious restrictions on the relative dimensions of 
the time and space steps used. It is at Los Alamos, it seems, that the first 
opportunities to use the method arose. Thanks to that,


very soon problems involving fluid dynamics, neutron diffusion and
transport, radiation flow, thermonuclear reactions and the like were 
being solved on various machines all over the United States 


writes Bethe’s first successor as chief of theoretical physics at Los Alamos,
R. D. Richtmyer, in a 1957 book explaining, among other things, advances made
after the war by von Neumann and Peter Lax concerning the convergence and 
stability of approximations; Banach spaces could now be used indirectly to 
understand what went on inside a bomb, for obviously this is what everybody 
was interested in at the time in Los Alamos. Lax, who spent his summers at 
Los Alamos during the 1950s, was one of applied mathematics’ rising stars 
and, later, a strong opponent of Bourbaki’s mathematics. He once wrote of 
Vietnam war opponents who wanted to enlist the AMS that most of them 
specialize in branches of mathematics that are abstract, often esoteric, and 
completely unmotivated by problems of the real world, thus implying that, had
they instead busied themselves with, say, the mathematics of shock waves, 
they would have had no qualms over B-52s flattening Laos... 
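
The instability mentioned above, and the restriction on the relative sizes
of the time and space steps that Courant explained to Bethe, can be seen on
a toy problem. The following Python sketch is only an illustration under
stated assumptions: it treats the one-dimensional linear advection equation
u_t + c u_x = 0 with a first-order “upwind” difference scheme and periodic
boundary, not the two-dimensional shock problem of Los Alamos, and all names
and numerical values in it are hypothetical. For this scheme the computation
stays bounded when the ratio c·dt/dx does not exceed 1 and blows up otherwise.

```python
def upwind_advect(u, courant, steps):
    """Advance u_t + c u_x = 0 (c > 0) by `steps` time steps.

    `courant` is the ratio c*dt/dx; the grid is periodic, so u[-1] is the
    left neighbour of u[0].
    """
    u = list(u)
    n = len(u)
    for _ in range(steps):
        # New value = old value minus courant times the backward difference.
        u = [u[i] - courant * (u[i] - u[i - 1]) for i in range(n)]
    return u


def max_abs(u):
    """Largest absolute value on the grid, used as a crude size measure."""
    return max(abs(x) for x in u)


# Assumed initial data: a square pulse on a grid of 40 points.
pulse = [1.0 if 8 <= i < 13 else 0.0 for i in range(40)]

stable = upwind_advect(pulse, 0.8, 200)    # c*dt/dx <= 1: remains bounded
unstable = upwind_advect(pulse, 1.2, 200)  # c*dt/dx > 1: grows without bound

print("ratio 0.8:", max_abs(stable))
print("ratio 1.2:", max_abs(unstable))
```

With the ratio at 0.8 each new value is a weighted average of two old ones,
so nothing can grow; with the ratio at 1.2 one of the weights becomes
negative and the highest-frequency components are multiplied by a factor
greater than 1 at every step, which is the kind of sensitivity to the choice
of steps described in the text.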

J-L. Lions, mentioned above, said much later in an interview (Le Monde, 
May 8, 1991) that he discovered applied mathematics and computers in 
America in 1956 thanks to Lax, who told him of von Neumann’s ideas; after 
mentioning a few current civilian applications, Lions treats us to a eulogy
of von Neumann, 


the father of the discipline who, at the end of the 1940s, was so 
able to guess all the benefits that would result from the use of the 
first computers to describe such complex systems as meteorological 
phenomena, 


and that he himself only added one chapter which von Neumann had not 
entered: the industrial chapter (with enough success to be a member of the 
board of several big French industrial companies during his last years). Von 
Neumann’s (and the Air Force’s) interest in meteorology is well known but, 
as the reader already knows, he was interested in other uses of computers. 
By 1956, 


[his] combination of scientific ability and practicality gave him a cred- 
ibility with military officers, engineers, industrialists, and scientists 
that no one else could match. He was the clearly dominant figure in 
nuclear missilery. 


This other eulogy is from Herbert York, Race to Oblivion (Simon & Schuster,
1970, p. 85); he was a member of the Teapot Committee which, chaired
by von Neumann, chose in 1954-1955 the characteristics of ATLAS, the first 
intercontinental missile. Lions may not have been told in 1956 of von Neu- 
mann’s taste for military projects, but in 1960, the year he started a seminar 
on numerical analysis in Paris, his first “really applied” paper was on nu- 
clear reactors. That he did not even hint in a 1991 interview at the huge 
military influence on the development of his discipline may be explained by
the Russian principle: show the best, hide the rest. One of his best students,
Roland Glowinski, tells us on the web that the A (for Automatics, i.e. Control)
of the IRIA (Institute of Research in Informatics and Automatics) that
Lions headed had been suggested by Pierre Faurre. A bright Polytechnicien
well known among applied mathematicians, Faurre published a book on the 
mathematics of inertial guidance (1971) in a collection directed by Lions. In 
America, this technique made Charles Stark Draper and his Instrumenta- 
tion Laboratory famous (it was the focus of student riots at MIT in 1969) 
and was developed first for strategic bombers, later missiles, and still later 
commercial planes; Faurre soon became the general secretary of SAGEM, a 
well-known company he eventually headed and which was making (among 
other things, e.g. telecommunication hardware and fire-control systems) in- 
ertial guidance systems for planes and missiles, whether civilian or military. 
One should not forget the multi-volume and multi-author treatise of Analyse 
mathématique et calcul numérique pour les sciences et les techniques (English 
trad. Springer) which Lions edited together with Robert Dautray, a Polytech- 
nicien who, from 1955 to 1998, followed a bright career at the French AEC 
(CEA) up to the highest position. Dautray was appointed scientific director 
of its Military Applications Division (DAM) in 1967 in order to help its engi- 
neers extricate themselves from the complexities of H-bomb design; it seems 
he did this by asking questions to a well-known British expert who told him 
they had found, but not recognized, the solution. To be sure, none of these 
connections proves that Lions did actual military work, and it may well be 
that he was mainly interested in applications to astronautics, meteorology, 
the environment, industrial processes, etc. Let me say simply that I have 
read too many biographies by scientists to trust them automatically to tell 
the whole truth. 

Richtmyer mentions “machines”. At Los Alamos in 1943, numerical computations
were first carried out on mechanical desk computers (distant
descendants of Pascal’s and Leibniz’s machines), as everywhere else. The
enormity of the task led physicists to order commercial IBM punch card ma- 
chines, improved to perform multiplications (!) and not merely additions. For 
months, Richard Feynman headed dozens of (human) computers who had
to push millions of punch cards into the machines. 

Von Neumann devoted two weeks to learning how to use them, which 
explains the shock that was his chance discovery, in 1944, of the Eckert- 
Mauchly team who, at the University of Pennsylvania, were designing the 
first electronic computing machine, ENIAC, to help the Aberdeen Proving 
Ground accelerate its firing-tables business; though not yet automatically 
programmable, ENIAC was far faster than IBM’s primitive machines were; 
it was not fully operational before the Fall of 1945 and was at once used 
for the H-bomb program, as was von Neumann’s own machine when oper- 
ational in 1952. Drawing in part on Eckert-Mauchly’s ideas, von Neumann 
formalized in 1945-1946 what is now called the “von Neumann architecture”,
thus creating true computers, and (slowly) built one at Princeton; Maurice 
Wilkes built one in Britain in 1948, Eckert-Mauchly delivered their first com- 
mercial UNIVAC in 1950, while another small company, ERA, delivered very 
advanced machines for cryptological work to the National Security Agency 
(NSA) also before 1950, as already said, all on von Neumann’s architecture. 
The Los Alamos and Livermore laboratories were first served with almost 
all the new “scientific” computers available, from copies of von Neumann’s 
machine to the present teraflop supercomputers, of which they were always 
the most demanding users and often the promoters. 

And while we are celebrating WW II applied mathematics in the United 
States, we might as well inquire about a country that is so often “forgotten”
by most apostles of applied mathematics: Germany, which in some scientific
and technical domains was well ahead of her enemies. At Göttingen,
Prandtl’s lab had been transformed during WW I into an Aerodynamische
Versuchsanstalt (AVA) which, in 1925, became associated with the
newly founded and more theory-oriented Kaiser-Wilhelm-Institut (KWI) für
Strömungsforschung. The arrival of the Nazis opened the way to the new
Luftwaffe, which was good for aerodynamics, and AVA expanded. Prandtl, 
who was much more an innocent than a Nazi, congratulated them publicly for 
it while trying, without success, to protect valuable scientists who were not 
100% Jewish. Now running under the Luftwaffe ministry and almost entirely 
devoted to the needs of the aeronautical industry, AVA was separated from 
the KWI in 1937. Work at KWI, under Prandtl, while more “fundamental” 
than at AVA, was nevertheless increasingly devoted to studies for the Luft- 
waffe (high speed aerodynamics), or von Braun (supersonic aerodynamics), 
or the Navy (cavitation studies for fast torpedoes), as well as for meteorology. 
A young mathematician, Henry Görtler, took charge of numerical computations
and devised simple ways of programming them for KWI’s biological
“computers”, young girls with a high school degree and desk machines. 

Outside fluid mechanics and ballistics, military research did not really 
start before 1942, when the Blitzkrieg myth was dispelled; as in 1914, most 
scientists had been mobilized like everyone else in 1939. Furthermore, Nazi 
Germany, a conglomerate of administrative fiefdoms fighting each other
for power, lacked the centralized coordination of R&D that America set up 
even before Pearl Harbor. Most Nazi leaders, Hitler to begin with, could 
hardly understand the importance of revolutionary weapons, except for their 
psychological impact. The development of jet fighters was delayed by two 
years (fortunately for Allied bombers) and von Braun’s V-2s production, 
though not development, longer still. In 1943, they changed their mind and 
tried to develop “miracle weapons” in earnest; engineers had plenty of these 
on their drawing boards, but it was too late for most of them. 

Student numbers enrolling in aerodynamics and the like grew from a mere 
80 in 1933 to reach 700 by 1939, while the Nazi policy had the opposite effect 
on mathematics (student enrolment fell by 90% at Göttingen) and physics,
not only as a result of dismissing Jewish scientists, but also because the offi- 
cial ideology favored more virile prospects. In physics, mentioning “Jewish” 
Relativity theory was anathema, but most atomic physicists were not foolish 
enough to fall into this trap. There was also a “Deutsche Mathematik” gang 
trying to discredit some parts of mathematics and the mathematicians connected
with it. Jewish-made transfinite numbers were fortunately not really
needed to compute rocket trajectories. 

Often at their own request mathematicians were eventually mobilized for 
military research. In Germany as in Allied countries, it was thus possible to 
protect scientists from the chances of a Turkish bullet, a fate which had so
incensed Ernest Rutherford when one of British physics’ rising stars, Henry
Moseley, was killed in the Dardanelles in 1915, a fate that should obviously
be reserved for scientifically uneducated people. Some mathematical work 
remained rather theoretical, like Wilhelm Magnus’ first version of the Magnus 
and Oberhettinger book on special functions, Erich Kamke’s on differential 
equations, or Lothar Collatz’s on eigenvalue calculations. Other studies were 
more directly applied to supersonic aerodynamics of shells and missiles, wing 
flutter, pursuit curves for self guided projectiles, cryptology, etc. Some well 
known “pure” mathematicians, like Helmut Hasse, Helmut Wielandt, Hans 
Rohrbach, even converted to it temporarily. Alwin Walther, Courant’s former 
assistant, who before the war had founded a Practical Mathematics Institute 
(IPM) at the Darmstadt Technische Hochschule, already worked for von Braun
in 1939, and IPM became the main computing center for military research 
during the war. Walther’s first task after the war was to direct the writing 
for the Allies of five reports on mathematics; he pointed out the similarity 
of German and American areas of work, miraculously bearing witness to the 
autonomous life and power of mathematical ideas across all borders. Courant
agreed and invited Walther to emigrate to the US; to this moving reunion — 
applied mathematicians of all countries, unite! — Walther, now a “pacifist”, 
preferred working for the reconstruction of his country. 

In Germany also, a remarkably clever engineer, Konrad Zuse, who had 
attended Hilbert’s lectures in mathematical logic, started in 1936, without 
any government help and ahead of the Americans, to build three comput- 
ing machines using telephone relays. The last one, Z3, became operational 
during the last months of the war and was used to control the shape of mass- 
produced rocket wings. All these machines were damaged during the war. 
Components of an electronic machine (which would have used 2,000 tubes 
instead of ENIAC’s 18,000) were built by his friend Helmut Schreyer; this
aroused even less interest, and Schreyer later emigrated to Brazil to teach. 
At the end of the war, Zuse went to the Zürich Poly where he built a Z4,
much more reliable than the first electronic machines, then enjoyed a suc- 
cessful technical and business career in computers, later at Siemens. He also 
invented a Plankalkül in 1945, i.e. a logical architecture for computers; but
he was not in a position to compete with von Neumann, if not in software,
at any rate in prestige and support. 


References: Amy Dahan-Dalmedico, L’essor des mathématiques appliquées
aux États-Unis: l’impact de la seconde guerre mondiale (Revue
d’histoire des mathématiques, 2 (1996), pp. 149-213) and two
papers by the same author and Peter Galison in Amy Dahan and Dominique
Pestre, eds, Les sciences pour la guerre, 1940-1960 (Paris,
EHESS, 2004), the first one dealing in detail with a Soviet team
at Gorky. On Germany, see H. Mehrtens, “Mathematics and War:
Germany, 1900-1945”, in Forman, National Military Establishments,
Sanford L. Segal, Mathematicians under the Nazis (Princeton UP,
2003), Konrad Zuse, The Computer. My Life (Springer, 1993).


Going back to America, a long report on applied mathematics stated in 1956: 


Let it also be said at the outset that, with very few exceptions, their 
organization does not antedate World War II and their continued ex- 
istence is due to the intervention of the Federal Government. Without 
the demands resulting from considerations of national security, ap- 
plied mathematics in this country might be as dead as a door nail 


According to the report, government administrations — i.e., in those times, 
military de jure or, like AEC or NACA, de facto — and connected industries 
were practically alone in employing professional applied mathematicians. A 
1962 report claimed that in 1960, out of 9,249 “professional mathematicians”
employed in government or industry, about 2,000 were in federal military 
centers, 1,000 at the AEC, while aeronautics and electronics employed 1,961 
and 1,226 respectively in the private sector. These two fields consistently got 
about 60% and 25% of the federal R&D money going to industry. 

In 1968, another report — this one about mathematics in general — recom- 
mended that the so-called mission-oriented agencies, namely Defense, AEC, 
NASA and NIH in that order, should continue to fund research in those do- 
mains most useful to their missions, and to propose their problems to the 
mathematical community. This report was edited during the Vietnam War 
by Lipman Bers, one of the main opponents of the war among mathemati-
cians. He explained in the 1976 Notices of the AMS that he had agreed to do
it only after being assured that the war would end before the report’s pub- 
lication; it ended five years later. A 1970 report finds 876 mathematicians 
(166 with PhDs) at AT&T, 170 at Boeing, 239 at McDonnell Douglas, 147 at
Raytheon, 68 at Sperry Rand, 287 at TRW, 137 at Westinghouse, etc. All of 
these high-tech companies had large military markets. 

In 1971, the DoD employed 81% of all mathematicians and statisticians 
employed by the government, 67% of all engineers, 41% of all physicists (but 
there was also the AEC), and 10% of all biologists and physicians. Serious 
work needing e.g. harmonic analysis, stochastic processes, information theory, 
differential equations and PDEs, etc., was performed most of the time via
university contracts. This is where historians should look to get a more precise 
idea of the importance of “higher” mathematics in military or industrial 
applications, a huge program. 

Applied mathematics and numerical analysis have many civilian applica- 
tions nowadays, but their degree of militarization always remained very high 
in the USA if we are to judge from the amount of federal funds attributed to 
them. The same is true a fortiori for what is now called computer science or 
informatics (logical architecture of machines, programming, network struc-
ture, etc., hardware excluded). Here is a simplified table, taken from NSF
statistics, on the main sources of federal funds (in current MD) for basic and 
applied research (no development) in mathematics and computer science at- 
tributed to all public or private organizations concerned with these fields: 

            1958   1964   1968   1974   1980   1987   1994   2001

Total       40.4     98    119    127    241    799  1,242  2,810
DoD         36.4     69     79     70    137    453    593    947
NSF          1.4   11.4   18.6     24     53    124    238    569
NASA           0    6.3    3.7    1.9    3.7     70     26     85
AEC/DoE      1.9    5.1    5.8    5.6   11.6     38  201.8    824

Since one 1958 dollar is worth about six 2001 dollars, this means that our
field got about twelve times as much money in 2001 as in 1958, while between
1945 and 1950, it got about two million per year from ONR, a large part of it 
going to the Whirlwind computer. Here too the change of scale is stunning. 
The more recent increase in DoE funding is largely due to the development 
of 3D simulation methods for nuclear weapons, as well as to controlled fusion 
experiments designed to check the computations: 751 MD were allocated to 
it in 2004. The DoD was planning in 1998 to spend some 2.5 BD over several 
years on simulation and modeling.
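
The factor of "about twelve" quoted above can be checked directly from the Total row of the table. The short sketch below only redoes that arithmetic; the variable names are illustrative, and the deflator of six is the rough figure given in the text, not an official price index.

```python
# Figures from the "Total" row of the table (current million dollars).
total_1958 = 40.4
total_2001 = 2810.0

# Assumption taken from the text: one 1958 dollar is worth
# about six 2001 dollars. This is a rough figure, not an official index.
dollars_2001_per_dollar_1958 = 6.0

# Growth measured in current (nominal) dollars.
nominal_ratio = total_2001 / total_1958

# Growth measured in constant dollars, after removing inflation.
real_ratio = nominal_ratio / dollars_2001_per_dollar_1958

print(f"growth in current dollars:  x{nominal_ratio:.1f}")
print(f"growth in constant dollars: x{real_ratio:.1f}")
```

In current dollars the total is multiplied by nearly seventy; dividing by the deflator brings this down to between eleven and twelve, which is the figure used in the text.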

Separating mathematics and computer science yields interesting results. 
The funding of informatics was still comparatively low in 1958; in 1980, out 
of a total 241 MD, computer science got 128 MD and mathematics 90, the 
remainder being a mixture of both. In 2001, mathematics got 396 MD and 
computer science 2,022. The difference is, of course, still more striking in 
applied research, for which maths (resp. computer science) got 23.8 (resp. 
82) MD in 1980, then 95 (resp. 566) in 1994, then 105 (resp. 1,438) in 2001. 
The same year, Defense ARPA’s funding was 8.7 MD for mathematics and 
424 for informatics. All of these figures are from the NSF statistical series. 
A striking feature of this growth since the 1970s is the fact that basic re- 
search in computer science has been increasingly financed by the NSF and 
decreasingly by DoD, in part a consequence of Mansfield’s amendment (1970) 
prohibiting DoD from funding research without explicit military relevance, in 
part the result of an increasing number of relatively small standard contracts 
with many new computer science departments which were then on the rise,
while ARPA limited its grants to a few “centers of excellence”. It was also 
due to the financing of specialized and costly equipment in universities, e.g. 
supercomputing centers connected to other places. The end result was that 
in 2001, the NSF spent 119 MD for basic research in mathematics and 450 
in computer science. 

Obviously, not all funding goes to universities. The following table gives
some idea of recent trends in the federal funding of research in universities 
(in current MD). 


                 1976   1984   1992   1999

Total              57    182    478    662
Mathematics        30     76    150    131
Computer Sci.      26     74    320    506


These data concern basic and applied research and represent a large part of 
the total, which also includes a small portion involving both sectors. For in- 
stance, in 1994, according to another NSF report which does not quite agree 
with the above data, the federal government attributed 196 MD to mathe- 
matics and 453 to computer science, while total expenses — funds specifically 
attributed to research by all sources — were 278 and 659 MD; this means that 
federal funds accounted for about two thirds of university research support 
in mathematics and computer science, the remainder being universities’ own 
funds and, presumably, industrial contracts at least in computer science. In 
2000, out of the total federal funding of university research in mathematics 
(resp. computer science) of 211.5 (resp. 568) MD, these fields got 29.5 (resp. 
209.8) from DoD, 8.9 (resp. 6.1) from DoE, 75.2 (resp. 0.5) from NIH (as 
against at most 12 MD before 2000), 0.7 (resp. 18.3) from NASA, and 99.6 
(resp. 336.6) from NSF. This is no longer the 1958 situation, when nearly all 
federal funds were military, and over 80% of military funding now goes to 
computer science. 

These statistics, mainly for the early years, do not accurately reflect the 
importance of activities specifically devoted to direct military work. Before 
the 1960s, when NSF hardly existed, military contracts went to many people 
who specialized in “useless” and “abstract” maths. These contracts allowed 
the universities to recruit more people, to help graduate and post-graduate 
students, to invite foreign colleagues, including perhaps the present author, 
and, last but not least, to secure America’s preponderance of power in mathe- 
matics as in everything else. However, it is not the bystander’s duty to prove 
that a military contract commits its beneficiary; it is up to the beneficiary 
who disputes it to prove that it does not. 

And how are we to explain that the life sciences sector, on the other hand, 
never benefited from proportionally equivalent DoD favors? In 1968, federal 
funding of life sciences totalled 1,534 MD, of which 105 came from the DoD; 
in 1994, 9.3 BD, of which 265 MD; and in 2001, 23.057 billion in federal
money, of which 1.052 billion from the DoD. Life sciences have been financed 
for fifty years essentially by the NIH (and, to a much lesser extent, by the 
NSF), and very strongly encouraged by Congress and the voters. As for the 
drug industry, it devotes billions to R&D without ever having received more 
than a few percentage points from the federal government, less than 4% in 
1993 for instance. In 2001, the industry spent a total of 12.2 BD, and since it 
belongs to the chemical industry sector, and the NSF tells us elsewhere that 
it got 150 MD in federal funds, an upper limit of 1.4% in federal funding for 
the drug industry follows. To be sure, drug companies indirectly benefit from 
their university contracts, but their main source of R&D money is obviously 
the countless products which are sold around the world to all who can afford 
them. 

After students rioted against the Vietnam War and military work in uni-
versities, the Mansfield amendment passed by Congress forbade the DoD from
financing research without a clear military interest, as already said. It was
somewhat softened later, but its spirit remained, and military support of 
“pure” mathematics nearly vanished, except in cryptology. The main threat 
to “pure” mathematics now comes from the enormous development of applied 
mathematics, even though their applications may be mostly civilian. As we 
shall see in the next section, this is the most striking difference between 
post-WW II applied mathematics and Jacobi’s mathematics pour l’honneur
de l’esprit humain (or for mathematicians’ entertainment...) which, to a very
large extent, were preponderant from the 1820s to the eve of WW II. 